A plant consisting essentially of cells which comprise in their genome a homozygous male-sterility genotype at a first genetic locus, 
and a color-linked restorer genotype at a second genetic locus, which is heterozygous (Rf/-) for a foreign DNA Rf. The foreign DNA 
Rf comprises: a) a fertility-restorer gene capable of preventing the phenotypic expression of the male-sterility genotype, and b) at least 
one anthocyanin regulatory gene involved in the regulation of anthocyanin biosynthesis in cells of seeds of the plant which is capable 
of producing anthocyanin at least in the seeds of the plant, so that anthocyanin production in the seeds is visible externally. Preferably, 
the anthocyanin regulatory gene is a shortened R, B or C1 gene or a combination thereof. The invention also relates to DNA sequences 
encoding shortened R, B or C1 anthocyanin regulatory genes and to a process for maintaining a line of male-sterile plants which comprises 
crossing a male-sterile parent plant and a maintainer parent plant comprising a homozygous male-sterility genotype and a restorer genotype 
comprising a fertility-restorer gene and an anthocyanin regulatory gene.

USE OF ANTHOCYANIN GENES TO MAINTAIN MALE STERILE PLANTS 



The present invention relates to a method to maintain 
male-sterile plants that can be used for the production 
of hybrid seed of a plant crop species, to transgenic 
inbred plants that can be used in such a process, and to 
chimeric genes that can be used to produce such 
transgenic inbred plants. 

Background to the Invention 

In many, if not most plant species, the development 
of hybrid cultivars is highly desired because of their 
generally increased productivity due to heterosis: the 
superiority of performance of hybrid individuals compared 
with their parents (see e.g. Fehr, 1987, Principles of 
cultivar development, Volume 1 : Theory and Technique, 
MacMillan Publishing Company, New York; Allard, 1960, 
Principles of Plant Breeding, John Wiley and Sons, Inc.). 

The development of hybrid cultivars of various plant 
species depends upon the capability of achieving 
essentially complete cross-pollination between 
parents. This is most simply achieved by rendering one of 
the parent lines male sterile (i.e. bringing it into a 
condition in which pollen is absent or nonfunctional) 
either manually, by removing the anthers, or genetically, 
by using, in the one parent, cytoplasmic or nuclear genes 
that prevent anther and/or pollen development (for a 
review of the genetics of male sterility in plants see 
Kaul, 1988, 'Male Sterility in Higher Plants', Springer 
Verlag). 

For hybrid plants where the seed is the harvested 
product (e.g. corn, oilseed rape) it is in most cases 
also necessary to ensure that fertility of the hybrid 
plants is fully restored. In systems in which the male 
sterility is under genetic control this requires the 
existence and use of genes that can restore male 
fertility. The development of hybrid cultivars is mainly 
dependent on the availability of suitable and effective 
sterility and restorer genes. 

Endogenous nuclear loci are known for most plant 
species that may contain genotypes which effect male 
sterility, and generally, such loci need to be homozygous 
for particular recessive alleles in order to result in a 
male-sterile phenotype. The presence of a dominant 'male 
fertile' allele at such loci results in male fertility. 

Recently it has been shown that male sterility can be 
induced in a plant by providing the genome of the plant 
with a chimeric male-sterility gene comprising a DNA 
sequence (or male-sterility DNA) coding, for example, for 
a cytotoxic product (such as an RNase) and under the 
control of a promoter which is predominantly active in 
selected tissue of the male reproductive organs. In this 
regard stamen-specific promoters, such as the promoter of 
the TA29 gene of Nicotiana tabacum, have been shown to be 
particularly useful (Mariani et al., 
1990, Nature 347:737, European patent publication ("EP") 
0,344,029). By providing the nuclear genome of the plant 
with such a male-sterility gene, an artificial 
male-sterility locus is created containing the artificial 
male-sterility genotype that results in a male-sterile 
plant. 

In addition it has been shown that male fertility can 
be restored to the plant with a chimeric 
fertility-restorer gene comprising another DNA sequence (or 
fertility-restorer DNA) that codes, for example, for a 
protein that inhibits the activity of the cytotoxic 
product or otherwise prevents the cytotoxic product from 
being active in the plant cells (European patent 
publication "EP" 0,412,911). For example the barnase gene 
of Bacillus amyloliquefaciens codes for an RNase, called 
barnase, which can be inhibited by a protein, barstar, 
that is encoded by the barstar gene of B. 
amyloliquefaciens. The barnase gene can be used for the 
construction of a sterility gene while the barstar gene 
can be used for the construction of a fertility-restorer 
gene. Experiments in different plant species, e.g. 
oilseed rape, have shown that a chimeric barstar gene can 
fully restore the male fertility of male-sterile lines in 
which the male sterility was due to the presence of a 
chimeric barnase gene (EP 0,412,911, Mariani et al., 
1991, Proceedings of the CCIRC Rapeseed Congress, July 9- 
11, 1991, Saskatoon, Saskatchewan, Canada; Mariani et 
al., 1992, Nature 357:384). By coupling a marker gene, 
such as a dominant herbicide resistance gene (for example 
the bar gene coding for phosphinothricin acetyl 
transferase (PAT) that converts the herbicidal 
phosphinothricin to a non-toxic compound [De Block et 
al., 1987, EMBO J. 6:2513]), to the chimeric 
male-sterility and/or fertility-restorer gene, breeding 
systems can be implemented to select for uniform 
populations of male-sterile plants (EP 0,344,029; EP 
0,412,911). 

The production of hybrid seed of any particular 
cultivar of a plant species requires: 1) the maintenance 
of small quantities of pure seed of each inbred parent, 
and 2) the preparation of larger quantities of seed of 
each inbred parent. Such larger quantities of seed would 
normally be obtained by several (usually two) seed 
multiplication rounds, starting from a small quantity of 
pure seed ("basic seed") and leading, in each 
multiplication round, to a larger quantity of pure seed 
of the inbred parent and then finally to a stock of seed 
of the inbred parent (the "parent seed" or "foundation 
seed") which is of sufficient quantity to be planted to 
produce the desired quantities of hybrid seed. Of course, 
in each seed multiplication round larger planting areas 
(fields) are required. 

In order to maintain and enlarge a small stock of 
seeds that can give rise to male-sterile plants it is 
necessary to cross the male sterile plants with normal 
pollen-producing parent plants. In the case in which the 
male-sterility is encoded in the nuclear genome, the 
offspring of such cross will in all cases be a mixture of 
male-sterile and male-fertile plants and the latter have 
to be removed from the former. With male-sterile plants 
containing an artificial male-sterility locus as 
described above, such removal can be facilitated by 
genetically linking the chimeric male sterility gene to a 
suitable marker gene, such as the bar gene, which allows 
the easy identification and removal of male-fertile 
plants (e.g. by spraying of an appropriate herbicide). 

However, even when suitable marker genes are linked 
to male-sterility genotypes, the maintenance of parent 
male-sterile plants still requires at each generation 
the removal from the field of a substantial number of 
plants. For instance in systems using a herbicide 
resistance gene (e.g. the bar gene) linked to a chimeric 
male-sterility gene, as outlined above, only half of the 
parent stock will result in male-sterile plants, thus 
requiring the removal of the male-fertile plants by 
herbicide spraying prior to flowering. In any given 
field, the removal of male-fertile plants effectively 
reduces the potential yield of hybrid seed or the 
potential yield of male-sterile plants during each round 
of seed multiplication for producing parent seed. In 
addition removal of the male-fertile plants may lead to 
irregular stands of the male-sterile plants. For these 
reasons removal of the male-fertile plants is 
economically unattractive for many important crop species 
such as corn and oilseed rape. 
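
The halving of the parent stock described above follows from ordinary Mendelian segregation at a single locus. The following is a minimal Python sketch, assuming a male-sterile parent that is hemizygous (S/-) for a chimeric male-sterility gene linked to bar and a pollinator that lacks the gene (-/-). The function name `cross` and the allele labels are illustrative choices, not taken from the cited publications.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(female, male):
    """Return offspring genotype frequencies for a single-locus cross.

    Each parent is a 2-tuple of alleles; every gamete carries one of the
    two alleles with probability 1/2.
    """
    counts = Counter()
    for egg, pollen in product(female, male):
        # Sort so that ("-", "S") and ("S", "-") count as the same genotype.
        genotype = tuple(sorted((egg, pollen), reverse=True))
        counts[genotype] += Fraction(1, 4)
    return dict(counts)


# "S" = male-sterility gene linked to bar (assumed label); "-" = gene absent.
offspring = cross(("S", "-"), ("-", "-"))

kept = offspring[("S", "-")]      # male-sterile, herbicide resistant
removed = offspring[("-", "-")]   # male-fertile, killed by the herbicide

for genotype, freq in offspring.items():
    print("/".join(genotype), freq)
```

Under these assumptions half of the plants in every generation carry no male-sterility gene and must be removed from the field.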



Anthocyanins are pigments that are responsible for 
many of the red and blue colors in plants. The genetic 
basis of anthocyanin biosynthesis has been well 
characterized, particularly in corn, Petunia, and 
Antirrhinum (Dooner et al, 1991, Ann. Rev. Genet. 25:179- 
199; Jayaram and Peterson, 1990, Plant Breeding Reviews 
2:91-137; Coe, 1994, In 'The Maize Handbook', Freeling 
and Walbot, eds. Springer Verlag New York Inc., p. 279- 
281). In corn anthocyanin biosynthesis is apparently 
under control of 20 or more genes. The structural loci 
C2, Whp, A1, A2, Bz1, and Bz2 code for various enzymes 
involved in anthocyanin biosynthesis and at least 6 
regulatory loci, acting upon the structural genes, have 
been identified in corn, i.e. the R, B, C1, Pl, P and Vp1 
loci. 

The R locus has turned out to be a gene family (in 
corn located on chromosome 10) comprising at least three 
different genes, i.e. R (which itself may comprise 
duplicate genes organized in a tandem array), and the 
displaced duplicate genes R(Sn) and R(Lc). R typically 
conditions pigmentation of the aleurone but various 
alleles are known to confer distinct patterns of 
pigmentation. R(Lc) is associated with unique 
pigmentation of leaves and R(Sn) with unique pigmentation 
of the scutellar node. One state of R is associated with 
pigmentation of the whole plant (R(P)), while another is 
associated with pigmentation of the seeds (R(S)). 

Alleles of the unlinked B locus (in corn located on 
chromosome 2) rarely condition pigmentation of the 
aleurone, but are frequently associated with pigmentation 
of mature plant parts. The B-peru allele, however, 
pigments the aleurone (like R(S)). Analysis at the 
molecular level has confirmed that the R and B loci are 
duplicate genes. 

In order that the R and B loci can color a particular 
tissue, the appropriate allele of the C1 or Pl locus also 
needs to be present. The C1 and C1-S alleles, for 
instance, pigment the aleurone when combined with the 
suitable R or B allele. 






Alleles of the C1 locus have been cloned and 
sequenced. Of particular interest are C1 (Paz-Ares et al, 
1987, EMBO J. 6:3553-3558) and C1-S (Schleffer et al, 
1994, Mol. Gen. Genet. 242:40-48). Analysis of the 
sequences revealed the presence of two introns in the 
coding region of the gene. The proteins encoded by the C1 
and C1-S alleles share homology with myb proto-oncogenes 
and are known to be nuclear proteins with DNA-binding 
capacity acting as transcriptional activators. 

The cDNA of the B-peru allele has also been analyzed 
and sequenced (Radicella et al, 1991, Plant Mol. Biol. 
17:127-130). Genomic sequences of B-peru were also 
isolated and characterized based on the homology between 
R and B (Chandler et al., 1989, The Plant Cell 1:1175- 
1183; Radicella et al., 1992, Genes & Development 6:2152- 
2164). The tissue specificity of anthocyanin production 
of two different B alleles was shown to be due to 
differences in the promoter and untranslated leader 
sequences (Radicella et al, 1992, supra). 

Various alleles of the R gene family have also been 
characterized at the molecular level, e.g. Lc (Ludwig et 
al, 1989, PNAS 86:7092-7096), R-nj, responsible for 
pigmentation of the crown of the kernel (Dellaporta et 
al, 1988, In "Chromosome Structure and Function: Impact 
of New Concepts", 18th Stadler Genetics Symposium, 
Gustafson and Appels, eds., New York, Plenum Press, pp. 
263-282), Sn (Consonni et al, 1992, Nucl. Acids Res. 
20:373), and R(S) (Perrot and Cone, 1989, Nucl. Acids 
Res. 17:8003). 

The proteins encoded by the B and R genes share 
homology with myc proto-oncogenes and have 
characteristics of transcriptional activators. 

It has been shown that various structural and 
regulatory genes, introduced in maize tissues by 
microprojectiles, operate in a manner similar to the 
endogenous loci and can complement genotypes which are 
deficient in the introduced genes (Klein et al., 1989, 
PNAS 86:6681-6685; Goff et al., 1990, EMBO J. 9:2517- 
2522). The Lc gene was also used as a visible marker for 
plant transformation (Ludwig et al., 1990, Science 
247:449-450). Apart from the above, other genes involved 
in anthocyanin biosynthesis have been cloned (Cone, 1994, 
In 'The Maize Handbook', Freeling and Walbot eds., 
Springer Verlag New York Inc., p. 282-285). 

In barley, Falk et al (1981, In Barley Genetics IV, 
Proceedings of the 4th International Barley Genetics 
Symposium, Edinburgh University Press, Edinburgh, pp. 
778-785) have reported the coupling of a male-sterile 
gene to a xenia-expressing shrunken endosperm gene which 
makes it possible to select seeds, before planting, that 
will produce male-sterile plants. Problems associated with 
such a proposal include complete linkage of the two genes 
(Stoskopf, 1993, Plant Breeding: Theory and Practice, 
Westview Press, Boulder, San Francisco, Oxford). In 
sweetcorn, a genetic system to produce hybrid corn seeds 
without detasseling, which utilizes the closely linked 
genes y (white endosperm) and ms (male sterility), was 
suggested but was never used because of contamination 
from 5% recombination. Galinat (1975, J. Hered. 66:387- 
388) described a two-step seed production scheme that 
resolved this problem by using electronic color sorters 
to separate yellow from white kernels. This approach has 
not been utilized commercially (Kankis and Davis, 1986, 
in "Breeding Vegetable Crops", the Avi Publishing 
Company Inc, Westport, Connecticut, U.S.A., p. 498). 

EP 0,198,288 and US Patent 4,717,219 describe methods 
for linking marker genes (which can be visible markers or 
dominant conditional markers) to endogenous nuclear loci 
containing nuclear male-sterility genotypes. 

EP 412,911 describes foreign restorer genes (e.g. 
barstar coding region under control of a stamen-specific 
promoter) that are linked to marker genes, including 
herbicide resistance genes and genes coding for pigments 
(e.g. the A1 gene) under control of a promoter which 
directs expression in specific cells, such as petal 
cells, leaf cells or seed cells, preferably in the outer 
layer of the seed. 

Summary of the Invention 

The invention concerns a maintainer plant consisting 
essentially of cells which comprise in their genome: 

- a homozygous male-sterility genotype at a first genetic 
locus; and 

- a color-linked restorer genotype at a second genetic 
locus, which is heterozygous (Rf/-) for a foreign DNA Rf 
comprising: 

a) a fertility-restorer gene capable of preventing the 
phenotypic expression of said male-sterility 
genotype, and 

b) at least one anthocyanin regulatory gene involved in 
the regulation of anthocyanin biosynthesis in cells 
of seeds of said plant and which is capable of 
producing anthocyanin at least in the seeds of said 
plant, so that anthocyanin production in the seeds 
is visible externally. 

The invention also concerns an anthocyanin regulatory 
gene which is a shortened R, B or C1 gene or a 
combination of shortened R, B or C1 genes which is 
functional for conditioning and regulating anthocyanin 
production in the aleurone. 

The invention also includes a DNA such as a plasmid 
comprising a fertility-restorer gene capable of 
preventing the phenotypic expression of a male-sterility 
genotype in a plant and at least one anthocyanin 
regulatory gene involved in the regulation of anthocyanin 
biosynthesis in cells of seeds of a plant and which is 
capable of producing anthocyanin at least in the seeds of 
a plant, so that anthocyanin production in the seeds is 
visible externally. 

Also within the scope of the invention is a process to 
maintain a line of male-sterile plants, which comprises 
the following steps: 

i) crossing: 

   a) a male-sterile parent plant of said line having, 
      in a first genetic locus, a homozygous 
      male-sterility genotype, and 

   b) a maintainer parent plant of said line consisting 
      essentially of cells which comprise, stably 
      integrated in their nuclear genome: 

      - a homozygous male-sterility genotype at a first 
        genetic locus; and 

      - a color-linked restorer genotype at a second 
        genetic locus, which is heterozygous for a 
        foreign DNA comprising: 

        i) a fertility-restorer gene capable of 
           preventing the phenotypic expression of 
           said male-sterility genotype, and 

        ii) at least one anthocyanin regulatory gene 
            involved in the regulation of anthocyanin 
            biosynthesis in cells of seeds of said 
            plant which is capable of producing 
            anthocyanin at least in the seeds of said 
            plant, so that anthocyanin production in 
            the seeds is visible externally, 

ii) obtaining the seeds from said parent plants, and 

iii) separating on the basis of color, the seeds in which 
     no anthocyanin is produced and which grow into 
     male-sterile parent plants. 

Preferably, the genome of the male-sterile parent plant 
does not contain at least one anthocyanin regulatory gene 
necessary for the regulation of anthocyanin biosynthesis 
in seeds of this plant to produce externally visible 
anthocyanin in the seeds. In one embodiment of the 
invention, the genome of the male-sterile parent plant 
contains a first anthocyanin regulatory gene and the 
genome of the maintainer plant a second anthocyanin 
regulatory gene which, when present with the first 
anthocyanin regulatory gene in the genome of a plant, is 
capable of conditioning the production of externally 
visible anthocyanin in seeds. 
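
The complementation logic of this embodiment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the labels "first" and "second" stand for the two anthocyanin regulatory genes mentioned above, and the function name `seed_is_colored` is hypothetical.

```python
def seed_is_colored(regulatory_genes, required=frozenset({"first", "second"})):
    """A seed shows anthocyanin only if every required regulatory gene is present."""
    return required <= set(regulatory_genes)


# Assumed gene content, following the embodiment described in the text.
male_sterile_parent = {"first"}     # carries only the first regulatory gene
maintainer_foreign_dna = {"second"} # foreign DNA Rf carries the second one

seed_without_rf = male_sterile_parent
seed_with_rf = male_sterile_parent | maintainer_foreign_dna

print(seed_is_colored(seed_without_rf))  # seed that did not inherit Rf
print(seed_is_colored(seed_with_rf))     # seed that inherited Rf
```

In this sketch neither regulatory gene alone gives color, so only seeds that inherit the foreign DNA from the maintainer are pigmented.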

The invention also concerns a process to maintain a line 
of maintainer plants, which comprises the following 
steps: 

i) crossing: 

a) a male-sterile parent plant as described 
previously, and 

b) a maintainer parent plant as described 
previously, 

ii) obtaining the seeds from said male-sterile parent 
plant, and 

iii) separating on the basis of color, the seeds in which 
anthocyanin is produced and which grow into 
maintainer parent plants. 

The invention also relates to a kit for maintaining a 
line of male-sterile or maintainer plants, said kit 
comprising: 

a) a male-sterile parent plant of said line as described 
previously, having, in a first genetic locus, a 
homozygous male-sterility genotype and which is 
incapable of producing externally visible anthocyanin 
in seeds, and 

b) a maintainer parent plant of said line as described 
previously. 

Also within the scope of the invention is a process to 
maintain the kit described previously which comprises: 

- crossing said male-sterile parent plant with said 
maintainer parent plant; 

- obtaining the seeds from said male-sterile parent 
plants and optionally the seeds from said maintainer 
parent plant in which no anthocyanin is produced; and 

- optionally growing said seeds into male-sterile parent 
plants and maintainer parent plants. 

As mentioned above, the present invention provides 
means to maintain a line of male-sterile plants, 
particularly corn or wheat plants. These means can be in 
the form of a process which comprises the following 
steps: 

i) crossing A) a first parent plant of said line, which 
is male-sterile, and which is genetically characterized 
by the absence of at least one anthocyanin regulatory 
gene thereby being incapable of producing anthocyanin in 
seeds, particularly in the aleurone layer, and also by 
having at a first genetic locus a homozygous 
male-sterility genotype, and B) a second parent plant of said 
line, which is male-fertile, and which is genetically 
characterized by having at said first genetic locus said 
homozygous male-sterility genotype, and at a separate 
second genetic locus the genotype Rf/-, 

whereby, 

Rf is a foreign chimeric DNA (the "color-linked 
restorer gene") stably integrated in the nuclear genome 
of said plant which comprises: 

a) a fertility-restorer gene that is capable of 
preventing the phenotypic expression, i.e. the 
male sterility, of said male-sterility genotype, and 

b) said at least one anthocyanin regulatory gene (the 
"color gene") involved in the regulation of the 
anthocyanin biosynthesis in cells of seeds of said 
cereal plant which is capable of producing 
anthocyanin at least in the seeds, particularly in 
the aleurone, of said cereal plant, 

ii) obtaining the seeds from said first parent plants, and 

iii) separating, on the basis of color, the seeds in 
which no anthocyanin is produced and in which the 
genotype at said first genetic locus is said homozygous 
male-sterility genotype and the genotype at said second 
genetic locus is -/-, and the seeds in which anthocyanin 
is produced and in which the genotype at said first 
genetic locus is said homozygous male-sterility genotype 
and the genotype at said second genetic locus is Rf/-. 
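
The segregation expected in steps i) to iii) can be checked by enumerating the gametes of both parents. The sketch below is written in Python and assumes a foreign male-sterility locus (genotype S/S in both parents), unlinked loci, and seeds harvested from the first (male-sterile) parent pollinated by the second parent. The helper names `gametes`, `cross` and `phenotype` are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def gametes(genotype):
    """One allele per locus; loci are assumed unlinked, so all combinations are equally likely."""
    return list(product(*genotype))


def cross(female, male):
    """Return seed genotype frequencies for a cross between two multi-locus genotypes."""
    eggs, pollen_grains = gametes(female), gametes(male)
    weight = Fraction(1, len(eggs) * len(pollen_grains))
    seeds = Counter()
    for egg, pollen in product(eggs, pollen_grains):
        # Pair the alleles locus by locus and normalize their order.
        seed = tuple(tuple(sorted(pair, reverse=True)) for pair in zip(egg, pollen))
        seeds[seed] += weight
    return seeds


def phenotype(seed):
    """Classify a seed by color and by the fertility of the plant it grows into."""
    sterility_locus, restorer_locus = seed
    colored = "Rf" in restorer_locus  # Rf carries the color gene and the restorer
    male_sterile = "S" in sterility_locus and not colored
    return ("colored" if colored else "colorless",
            "male-sterile" if male_sterile else "male-fertile")


first_parent = (("S", "S"), ("-", "-"))    # male-sterile, colorless seeds
second_parent = (("S", "S"), ("Rf", "-"))  # maintainer, genotype S/S, Rf/-

sorted_seeds = Counter()
for seed, freq in cross(first_parent, second_parent).items():
    sorted_seeds[phenotype(seed)] += freq

for classes, freq in sorted_seeds.items():
    print(classes, freq)
```

Under these assumptions the harvested seeds fall into two equal classes: colorless seeds (S/S, -/-) that grow into male-sterile plants and colored seeds (S/S, Rf/-) that grow into maintainer plants, so sorting by color separates the two without removing plants from the field.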

Of particular interest in the invention is a second 
parent plant in which said at least one anthocyanin 
regulatory gene comprises a gene derived from a genomic 
clone of an R or B gene, particularly an R or B gene that 
conditions anthocyanin production in the aleurone, 
preferably the B-peru allele (e.g. the shortened B-peru 
gene in pCOL13), and/or comprises a gene derived from a 
genomic clone of the C1 gene (e.g. the gene with the 
sequence of SEQ ID NO 1 or SEQ ID NO 5) or the C1-S gene. 

The first genetic locus can be endogenous to plants 
of said line (in which case the homozygous male-sterility 
genotype will be m/m), but is preferably a foreign locus 
with genotype S/S in which S is a foreign DNA which, when 
expressed in a plant, is capable of rendering the plant 
male-sterile. A preferred foreign DNA comprises at least: 

s1) a male-sterility DNA encoding an RNA, protein or 
polypeptide which, when produced or overproduced in a 
cell of the plant, significantly disturbs the 
metabolism, functioning, and/or development of the 
cell, and, 

s2) a sterility promoter capable of directing expression 
of the male-sterility DNA selectively in stamen 
cells, preferably tapetum cells, of the plant; the 
male-sterility DNA being in the same transcriptional 
unit as, and under the control of, the sterility 
promoter. 

In case such a foreign male-sterility genotype is used, 
the fertility-restorer gene in the foreign DNA Rf 
preferably comprises at least: 

a1) a fertility-restorer DNA encoding a restorer RNA, 
protein or polypeptide which, when produced or 
overproduced in the same stamen cells as said 
male-sterility gene S, prevents the phenotypic expression 
of said foreign male-sterility genotype comprising 
S, and, 

a2) a restorer promoter capable of directing expression 
of the fertility-restorer DNA at least in the same 
stamen cells in which said male-sterility gene S is 
expressed, so that the phenotypic expression of said 
male-sterility gene is prevented; the 
fertility-restorer DNA being in the same transcriptional unit 
as, and under the control of, the restorer promoter. 

In case of an endogenous male-sterility genotype which is 
homozygous for the recessive male-sterility allele m, the 
fertility restorer gene is preferably a DNA comprising 
the dominant allele M of said locus. 

WO 95/34634 PCT/EP95/02157 

The present invention also provides the novel foreign 
chimeric DNA Rf as used in the second parent plants, 
plasmids comprising these chimeric genes, and host cells 
comprising these plasmids. 

The present invention also provides the shortened 
B-peru gene in pCOL13 (SEQ ID NO 6) and the shortened C1 
gene, particularly the EcoRI-SfiI fragment of pCOL9 of 
SEQ ID NO 5. 

The present invention further provides plants the 
nuclear genome of which is transformed with the foreign 
chimeric DNA Rf, particularly the second parent plant. 

Detailed Description of the Invention 

A male-sterile plant is a plant of a given plant 
species which is male-sterile due to expression of a 
male-sterility genotype such as a foreign male-sterility 
genotype containing a male-sterility gene. A restorer 
plant is a plant of the same plant species that contains 
within its genome at least one fertility-restorer gene 
that is able to restore the male fertility in those 
offspring obtained from a cross between a male-sterile 
plant and a restorer plant and containing both a 
male-sterility genotype and a fertility-restorer gene. A 
restored plant is a plant of the same species that is 
male-fertile and that contains within its genome a 
male-sterility genotype and a fertility-restorer gene. 

A line is the progeny of a given individual plant. 

A gene as used herein is generally understood to 
comprise at least one coding region coding for an RNA, 
protein or polypeptide which is operably linked to 
suitable promoter and 3' regulatory sequences. A 
structural gene is a gene whose product is, e.g., an 
enzyme, a structural protein, tRNA or rRNA. For example 
anthocyanin structural genes encode enzymes (e.g. 
chalcone synthase) directly involved in the biosynthesis 
of anthocyanins in plant cells. A regulatory gene is a 
gene which encodes a regulator protein which regulates 
the transcription of one or more structural genes. For 
example the R, B, and C1 genes are regulatory genes that 
regulate transcription of anthocyanin structural genes. 

For the purpose of this invention the expression of a 
gene, such as a chimeric gene, will mean that the 
promoter of the gene directs transcription of a DNA into 
an mRNA which is biologically active, i.e. which is either 
capable of interacting with another RNA, or which is 
capable of being translated into a biologically active 
polypeptide or protein. 

The phenotype is the external appearance of the 
expression (or lack of expression) of a genotype, i.e. of 
a gene or set of genes (e.g. male-sterility, seed color, 
presence of protein or RNA in specific plant tissues 
etc.). 

As used herein, a genetic locus is the position of a 
given gene in the nuclear genome, i.e. in a particular 
chromosome, of a plant. Two loci can be on different 
chromosomes and will then segregate independently. Two loci 
can be located on the same chromosome and are then 
generally considered as being linked (unless sufficient 
recombination can occur between them). 

An endogenous locus is a locus which is naturally 
present in a plant. A foreign locus is a locus which is 
formed in the plant because of the introduction, by means 
of genetic transformation, of a foreign DNA. 

In diploid plants, as in any other diploid organisms, 
two copies of a gene are present at any autosomal locus. 
Any gene can be present in the nuclear genome in several 
variant states designated as alleles. If two identical 
alleles are present at a locus, that locus is designated 
as being homozygous; if different alleles are present, 
the locus is designated as being heterozygous. The 
allelic composition of a locus, or a set of loci, is the 
genotype. Any allele at a locus is generally represented 
by a separate symbol (e.g. M and m, S and -, with - 
representing the absence of the gene). A foreign locus is 
generally characterized by the presence and/or absence of 
a foreign DNA. A heterozygous genotype in which one 
allele corresponds to the absence of the foreign DNA is 
also designated as hemizygous (e.g. Rf/-). A dominant 
allele is generally represented by a capital letter and 
is usually associated with the presence of a biologically 
active gene product (e.g. a protein) and an observable 
phenotypic effect (e.g. R indicates the production of an 
active regulator protein and under appropriate conditions 
anthocyanin production in a given tissue while r 
indicates that no active regulator protein is produced, 
possibly leading to absence of anthocyanin production). 

A plant can be genetically characterized by 
identification of the allelic state of at least one 
genetic locus. 

The genotype of any given locus can be designated by 
the symbols for the two alleles that are present at the 
locus (e.g. M/m or m/m or S/-). The genotype of two 
unlinked loci can be represented as a sequence of the 
genotype of each locus (e.g. S/S, Rf/-). 
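
The notation defined above is regular enough to be read mechanically. As a small Python sketch, assuming that loci are separated by commas, alleles by a slash, and that "-" denotes absence of a foreign DNA (the function names `classify_locus` and `classify_genotype` are illustrative):

```python
def classify_locus(locus):
    """Classify one locus written as 'allele/allele', e.g. 'M/m' or 'Rf/-'."""
    first, second = (allele.strip() for allele in locus.split("/"))
    if first == second:
        return "homozygous"
    if "-" in (first, second):
        # One allele corresponds to the absence of the foreign DNA.
        return "hemizygous"
    return "heterozygous"


def classify_genotype(text):
    """Classify every locus of a genotype written as 'locus, locus, ...'."""
    return {locus.strip(): classify_locus(locus) for locus in text.split(",")}


print(classify_genotype("S/S, Rf/-"))
print(classify_genotype("M/m"))
print(classify_genotype("m/m"))
```

Here a hemizygous locus is reported separately from other heterozygous loci, matching the terminology used in the text for genotypes such as Rf/-.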

The nuclear male-sterility genotype as used in this 
invention refers to the genotype of at least one locus 
in a plant (the "male-sterility locus") the allelic 
composition of which may result in male sterility in the 
plant. A male-sterility locus may be endogenous to the 
plant, but it is generally preferred that it is foreign 
to the plant. 

Foreign male-sterility loci are those in which the 
allele responsible for male sterility is a foreign DNA 
sequence S (the "male-sterility gene") which, when 
expressed in cells of the plant, makes the plant 
male-sterile without otherwise substantially affecting the 
growth and development of the plant. Such a male-sterility 
gene preferably comprises, at least: 

s1) a male-sterility DNA encoding a sterility RNA, 
protein or polypeptide which, when produced or 
overproduced in a stamen cell of the plant, 
significantly disturbs the metabolism, functioning 
and/or development of the stamen cell, and, 

s2) a sterility promoter capable of directing expression 
of the male-sterility DNA selectively in stamen cells 
(e.g. anther cells or tapetum cells) of the plant; 
the male-sterility DNA being in the same 
transcriptional unit as, and under the control of, 
the sterility promoter. 

The male-sterility locus preferably also comprises in 
the same genetic locus at least one first marker gene T 
which comprises at least: 

t1) a first marker DNA encoding a first marker RNA, 
protein or polypeptide which, when present at least 
in a specific tissue or specific cells of the plant, 
renders the plant easily separable from other plants 
which do not contain the first marker RNA, protein or 
polypeptide encoded by the first marker DNA at least 
in the specific tissue or specific cells, and, 

t2) a first marker promoter capable of directing 
expression of the first marker DNA at least in the 
specific tissue or specific cells; the first marker 
DNA being in the same transcriptional unit as, and 
under the control of, the first marker promoter. 

Such male-sterility gene is always a dominant allele 
at such a foreign male-sterility locus. The recessive 
allele corresponds to the absence of the male-sterility 
^ gene in the nuclear genome of the plant. 

Male-sterility DNAs and sterility promoters that can 
be used in the male-sterility genes in the first parent 
line of this invention have been described before (EP
0,344,029 and EP 0,412,911). For the purpose of this
invention the expression of the male-sterility gene in a
plant cell should be able to be inhibited or repressed,
for instance by means of expression of a suitable
fertility-restorer gene in the same plant cell. In this
regard a particularly useful male-sterility DNA codes for
barnase (Hartley, J. Mol. Biol. 1988 202:913). The
sterility promoter can be any promoter but it should at
least be active in stamen cells, particularly tapetum
cells. Particularly useful sterility promoters are
promoters that are selectively active in stamen cells,
such as the tapetum-specific promoters of the TA29 gene
of Nicotiana tabacum (EP 0,344,029), which can be used in
tobacco, oilseed rape, lettuce, chicory, corn, rice,
wheat and other plant species; the PT72, the PT42 and PE1
promoters from rice which can be used in rice, corn,
wheat, and other plant species (WO 92/13956); the PCA55
promoter from corn which can be used in corn, rice, wheat
and other plant species (WO 92/13957); and the A9
promoter of a tapetum-specific gene of Arabidopsis
thaliana (Wyatt et al., 1992, Plant Mol. Biol. 19:611-
922). However, the sterility promoter may also direct
expression of the sterility DNA in cells outside the
stamen, particularly if the effect of expression of the
male-sterility DNA is such that it will specifically
disturb the metabolism, functioning and/or development of
stamen cells so that no viable pollen is produced. One
example of such a male-sterility DNA is the DNA coding
for an antisense RNA which is complementary to the mRNA
of the chalcone synthase gene (van der Meer et al (1992)
The Plant Cell 4:253-262). In this respect a useful
promoter is the 35S promoter (see EP 0,344,029),
particularly a 35S promoter that is modified to have
enhanced activity in tapetum cells as described by van
der Meer et al (1992) The Plant Cell 4:253-262 (the "35S-
tap promoter").

A preferred endogenous male-sterility locus is one in
which a recessive allele (hereinafter designated as m) in
homozygous condition (m/m) results in male sterility. At
such loci male fertility is encoded by a corresponding
dominant allele (M). In many plant species such
endogenous male-sterility loci are known (see Kaul,
1988, supra; in corn see also recent issues of Maize
Genetics Cooperation Newsletter, published by Department
of Agronomy and U.S. Department of Agriculture,
University of Missouri, Columbia, Missouri, U.S.A.). The
DNA sequences in the nuclear genome of the plant 
corresponding to m and M alleles can be identified by 
gene tagging i.e. by insertional mutagenesis using 
transposons, or by means of T-DNA integration (see e.g. 
Wienand and Saedler, 1987, In 'Plant DNA Infectious
Agents', Ed. by T.H. Hohn and J. Schell, Springer Verlag
Wien New York, p. 205; Shepherd, 1988, In 'Plant
Molecular Biology: a Practical Approach', IRL Press, p.
187; Teeri et al., 1986, EMBO J. 5:1755). It will be
evident that in the first and second parent plant of this
invention S/S can be replaced by m/m without affecting
the outcome of the process. Indeed, one feature of the 
process of this invention is that the male-sterility 
locus is homozygous thus allowing the use of 'recessive' 
male-sterility alleles. 

Fertility-restorer DNAs that can be used in the 
fertility restorer gene in the second parent line of this 
invention have been described before (EP 0,412,911).
In this regard, fertility-restorer genes in which the
fertility-restorer DNA encodes barstar (Hartley, J. Mol.
Biol. 1988 202:913) are particularly useful to inhibit
the expression of a male-sterility DNA that encodes 
barnase. In this regard it is believed that a fertility- 
restorer DNA that codes for a mutant of the barstar 
protein, i.e. one in which the Cysteine residue at 
position 40 in the protein is replaced by serine 
(Hartley, 1989, TIBS 14:450), functions better in 
restoring the fertility in the restored plants of some 
species. 

In principle any promoter can be used as a restorer 
promoter in the fertility restorer gene in the second 
parent line of this invention. The only prerequisite is 
that such second parent plant, which contains both the 
color gene and the fertility-restorer gene, should be
phenotypically normal and male-fertile. This requires
that the restorer promoter in the fertility-restorer gene
should be at least active in those cells of a plant of
the same species in which the sterility promoter of the
corresponding male-sterility gene can direct expression
of the male-sterility DNA. In this regard it will be
preferred that the sterility promoter and the restorer
promoter are the same; they can for example be both
stamen-specific promoters (e.g. the TA29 promoter or the
CA55 promoter) or they can be both constitutive promoters
(such as the 35S or 35S-tap promoter). However, the
sterility promoter may be active only in stamen cells
while the restorer promoter is also active in other
cells. For instance, the sterility promoter can be a
stamen-specific promoter (such as the TA29 or CA55
promoter) while the restorer promoter is the 35S-tap
promoter.
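
The promoter requirement stated in this paragraph amounts to a subset condition: every cell type in which the sterility promoter is active must also be covered by the restorer promoter. The following is a minimal Python sketch of that check. The activity sets and cell-type labels are simplified, hypothetical assumptions made for illustration and are not expression data from this specification.

```python
# Hypothetical, simplified activity sets (assumptions for illustration only).
PROMOTER_ACTIVITY = {
    "TA29": {"tapetum"},                       # stamen-specific promoter
    "35S-tap": {"tapetum", "leaf", "root"},    # constitutive, enhanced in tapetum
}


def restorer_covers_sterility(sterility_cells, restorer_cells):
    """Return True if the restorer promoter is active in every cell type
    in which the sterility promoter is active."""
    return set(sterility_cells) <= set(restorer_cells)


# Sterility gene under TA29, restorer gene under 35S-tap: acceptable pairing.
ok = restorer_covers_sterility(
    PROMOTER_ACTIVITY["TA29"], PROMOTER_ACTIVITY["35S-tap"]
)

# Reversed pairing: sterility DNA would be expressed where no restorer is made.
bad = restorer_covers_sterility(
    PROMOTER_ACTIVITY["35S-tap"], PROMOTER_ACTIVITY["TA29"]
)
```

Under these assumed sets the first pairing passes and the reversed pairing fails, which mirrors the asymmetry described in the text: the restorer promoter may be broader than the sterility promoter, but not narrower.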

When the male sterility to be restored is due to the
male-sterility genotype at an endogenous male-sterility
locus being homozygous for a recessive allele m, it is
preferred that the fertility-restorer gene is the
dominant allele of that male-sterility locus, preferably
under control of its own promoter. The DNA corresponding
to such a dominant allele, including its natural promoter 
can be isolated from the nuclear genome of the plant by 
means of gene tagging as described above. 

The nature of the color gene that is used in the
color-linked restorer gene in the second parent plant of
this invention depends upon the genotype of the
untransformed plants of the same line. Preferably, only
cereal plants with a genotype that does not condition
externally visible anthocyanin production in seeds,
particularly in the aleurone, can be used to produce the
second parent plants. These plants usually have a
genotype in which no functional copy of a suitable
regulatory gene such as the R or B gene, and/or the C1
gene, is present.

In corn, for instance, all of the currently used
inbred lines in the U.S.A. are r-r (pink anthers, leaf
tips, plant base) or r-g (green) and most of these are c1
and pl; at the B locus the B-peru allele is very rare
(Coe et al, 1988, In 'Corn and Corn Improvement', 3rd
edition, G.F. Sprague and J.W. Dudley, eds., American
Society of Agronomy, Inc. Publishers, Madison, Wisconsin,
U.S.A.). The result is that no anthocyanins are produced
in the aleurone of these lines and that the kernels are
yellow. This requires that when these lines are
transformed with a color-linked restorer gene, the color
gene should consist of a functional R or B gene which
conditions anthocyanin production in aleurone, and
usually also a functional C1 gene capable of conditioning
anthocyanin production in aleurone.

A useful R or B gene is the B-peru gene, but of
course also other R genes could be used such as the R(S)
gene (Perrot and Cone, 1989, Nucl. Acids Res. 17:8003).
In this regard a gene derived from genomic clones of the
B-peru gene (Chandler et al, 1989, The Plant Cell 1:1175-
1183) is believed to be particularly useful. However the
length of this genomic DNA (11 kbp) renders its practical
manipulation and use for transformation by direct gene
transfer difficult, certainly in combination with other
genes such as the restorer gene and the C1 gene.

In one inventive aspect of this invention it was 
found that the B-peru gene could be considerably 
shortened while still retaining, under appropriate 
conditions, its capability of conditioning anthocyanin 
production in the aleurone of seeds of cereal plants such 
as corn. A preferred shortened B-peru gene is that of
Example 2.2, which is contained in plasmid pCOL13
(deposited under accession number LMBP 3041).

A useful C1 gene is the genomic clone as described by
Paz-Ares et al, 1987, EMBO J. 6:3553-3558. However the
length of this genomic DNA (4 kbp) precludes its
practical manipulation and use for transformation by
direct gene transfer, certainly in combination with other
genes such as the restorer gene and the B-peru gene.
Nevertheless other variants of the C1 gene can also be
used. In this regard Scheffler et al, 1994,
Mol. Gen. Genet. 242:40-48 have described the C1-S allele
which differs from the C1 allele of Paz-Ares et al, supra
by a few nucleotides in the promoter region near the CAAT
box and which is dominant to the wild-type allele (C1)
and shows enhanced pigmentation. The C1-S gene can be
easily used in this invention by appropriate changes in
the C1 gene. For example the TGCAG at positions 935 to
939 in SEQ ID NO 1 (respectively at positions 884-888 in
SEQ ID NO 5) can be easily changed to TTAGG yielding a
C1-S allele (respectively pCOL9S).
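
The TGCAG to TTAGG change described here is a fixed-length substitution at a known 1-based position. A minimal Python sketch of such an edit is given below. The function name `substitute_motif` and the toy sequence are hypothetical and used only for illustration; the toy sequence is not SEQ ID NO 1 or SEQ ID NO 5. The function first verifies that the expected motif is actually present at the stated position, which guards against off-by-one coordinate errors.

```python
def substitute_motif(seq, start, expected, replacement):
    """Replace `expected` found at 1-based position `start` of `seq`
    with `replacement`; raise ValueError if the motif is not there."""
    i = start - 1  # convert 1-based sequence-listing coordinate to 0-based
    found = seq[i:i + len(expected)]
    if found.upper() != expected.upper():
        raise ValueError(
            f"expected {expected} at position {start}, found {found!r}"
        )
    return seq[:i] + replacement + seq[i + len(expected):]


# Toy sequence (an assumption for illustration): TGCAG occupies positions 5-9.
toy = "AAAATGCAGCCCC"
mutated = substitute_motif(toy, 5, "TGCAG", "TTAGG")
```

With a real sequence loaded as a string, the same call with `start=935` (SEQ ID NO 1 numbering) or `start=884` (SEQ ID NO 5 numbering) would apply the change described in the text, provided the motif is found at that coordinate.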

In one inventive aspect of this invention it was
found that the C1 gene (and the C1-S gene) could be
considerably shortened while still retaining, under
appropriate conditions, its capability of conditioning
anthocyanin production in the aleurone of seeds of cereal
plants such as corn. Preferred shortened C1 genes for
instance are those of Example 2.1 such as comprised in
pCOL9 which has the sequence of SEQ ID NO 5, particularly
as comprised between the EcoRI and SfiI sites of pCOL9,
and the corresponding shortened C1-S gene in pCOL9S.

The transcribed regions of the shortened B-peru and C1
genes still contain some small introns which can also be
deleted without affecting the function of the genes. It
is also believed that the shortened B-peru and C1 genes
can be somewhat further truncated at their 5' and 3'
ends, without affecting their expression in aleurone. In
particular it is believed that the sequence between
positions 1 and 3272 of SEQ ID NO 6 can also be used as a
suitable B-peru gene. It is also believed that this gene
can still be truncated at its 3' end down to a position
between nucleotides 2940 and 3000 of SEQ ID No. 6.

Although the use of genomic sequences of the B-peru
gene and the C1 gene, particularly the shortened B-peru
and/or the shortened C1 or C1-S genes, is preferred,
chimeric R, B, or C1 genes can also be used. For instance
a chimeric gene can be used which comprises the coding
region (e.g. obtained from the cDNA) of any functional R
or B gene (i.e. which conditions anthocyanin production
anywhere in the plant) which is operably linked to the
promoter region of a R or B gene which conditions
anthocyanin production in the aleurone (such as R(S) or
B-peru). Since the presence of anthocyanin does not
negatively affect growth, development and functioning of
plant cells, a constitutive promoter (e.g. the 35S
promoter), or a promoter which directs expression at
least in the aleurone can also be used in such a chimeric
gene. In this regard the promoter of the C1 gene can also
be used to direct expression of a DNA comprising the
coding region of a suitable R or B gene, particularly the
B-peru gene.

Similarly the coding region (e.g. obtained from cDNA)
of the C1 gene can be operably linked to the promoter of
a gene that directs expression at least in the aleurone.
In this regard, the promoter of the B-peru gene can also
be used to direct expression of a DNA comprising the
coding region of a suitable C1 gene such as that of the
C1 gene of SEQ ID No. 1 or of the C1-S gene.

In another inventive aspect of the invention it was 
found that the promoters comprised in DNAs
characterized by the sequences between positions 1 to 
1077, particularly between positions 447 and 1077, quite 
particularly between positions 447 and 1061 of SEQ ID NO 
1, between positions 396 and 1026 of SEQ ID NO 5, and 
between positions 1 to 575, particularly between position 
1 to 188 of SEQ ID NO 6 are promoters that predominantly, 
if not selectively, direct expression of any DNA, 
preferably a heterologous DNA in the aleurone layer of 
the seeds of plants. 
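
The coordinate pairs quoted in this specification (447 to 1077 of SEQ ID NO 1 against 396 to 1026 of SEQ ID NO 5, and the TGCAG motif at 935 to 939 against 884 to 888) are all consistent with a constant shift of 51 nucleotides between the two numberings. The Python sketch below encodes that shift. It assumes the offset is constant over the shared region, with no insertions or deletions between the two listings, which is an inference from the quoted positions and not something the text states explicitly. The function names are hypothetical.

```python
# Offset inferred from the TGCAG coordinates quoted in the text:
# position 935 in SEQ ID NO 1 corresponds to position 884 in SEQ ID NO 5.
OFFSET = 935 - 884  # 51


def seq1_to_seq5(pos):
    """Map a SEQ ID NO 1 position to SEQ ID NO 5 numbering (assumed constant shift)."""
    return pos - OFFSET


def seq5_to_seq1(pos):
    """Map a SEQ ID NO 5 position back to SEQ ID NO 1 numbering."""
    return pos + OFFSET


# Promoter region boundaries quoted for SEQ ID NO 1, converted to SEQ ID NO 5.
promoter_in_seq5 = (seq1_to_seq5(447), seq1_to_seq5(1077))
```

Converting the promoter boundaries this way reproduces the 396 and 1026 quoted for SEQ ID NO 5, which is a useful cross-check when moving between the genomic clone and the pCOL9 insert.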

Of course in those lines in which a functional C1
gene is already present in the genome the color gene can
consist only of a suitable functional R or B gene (or a
chimeric alternative). Alternatively if a line already
contains a functional R or B gene which can condition
anthocyanin production in the aleurone, but no functional
C1 gene, only a functional C1 gene is required as a color
gene.
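
The choice of color gene described above depends only on which regulatory genes the recipient line already carries. A minimal Python sketch of that decision follows. The function name and boolean parameters are hypothetical, and the sketch deliberately ignores other loci (such as structural anthocyanin genes) that a real line would also need.

```python
def required_color_genes(has_functional_r_or_b, has_functional_c1):
    """Return the regulatory genes that the color gene must supply so that
    anthocyanin can be produced in the aleurone of the recipient line."""
    needed = []
    if not has_functional_r_or_b:
        needed.append("R or B")   # e.g. a B-peru gene
    if not has_functional_c1:
        needed.append("C1")       # e.g. a shortened C1 or C1-S gene
    return needed


# Typical yellow-kernel inbred line as described in the text: both are needed.
typical_inbred = required_color_genes(False, False)

# Line that already has a functional C1 gene: only R or B must be supplied.
c1_line = required_color_genes(False, True)
```

An empty result would correspond to a line that already conditions anthocyanin in the aleurone; as noted earlier in the text, such lines are not the preferred starting material because the seed color could then no longer mark the presence of the restorer gene.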

It is believed that the color genes of this invention 
are especially useful in cereal plants, and that they are 
of particular use in corn and wheat, and certainly in 
corn.

For the purposes of this invention it is preferred
that, in the second parent plants, the "Rf" locus and the
male-sterility (e.g. "S") locus are not linked and
segregate separately.

In the second parent plant, the fertility restorer
gene, the B-peru gene and the C1 gene are preferably
closely linked. This can of course be achieved by
introducing these genes in the nuclear genome of the
plants as a single transforming foreign DNA (the Rf DNA)
thus forming a foreign Rf locus. Alternatively, the
fertility restorer gene and the color gene can be
separately introduced by cotransformation which usually
results in single locus insertions in the plant genome.

The color-linked restorer gene Rf as used in the 
second parent plant preferably also comprises at least 
c) a second marker gene which comprises at least: 

c1) a second marker DNA encoding a second marker RNA,
protein or polypeptide which, when present at least in a
specific tissue or specific cells of the plant, renders
the plant easily separable from other plants which do not
contain the second marker RNA, protein or polypeptide
encoded by the second marker DNA at least in the specific
tissue or specific cells, and,

c2) a second marker promoter capable of directing
expression of the second marker DNA at least in the 
specific tissue or specific cells: the second marker DNA 
being in the same transcriptional unit as, and under the 
control of, the second marker promoter. 

First and second marker DNAs and first and second 
marker promoters that can be used in the first and second 
marker genes of this invention are also well known (EP 
0,344,029; EP 0,412,911). In this regard it is preferred 
that the first and second marker DNA are different, 
although the first and second marker promoter may be the 
same. 

Foreign DNA such as the fertility-restorer gene, the
foreign male-sterility gene, the B-peru and the C1 genes,
or the first or second marker gene preferably also are
provided with suitable 3' transcription regulation
sequences and polyadenylation signals, downstream (i.e.
3') from their coding sequence i.e. respectively the
fertility-restorer DNA, the male-sterility DNA, the
coding region of a color gene (such as a B-peru gene
and/or a C1 gene) or the first or second marker DNA. In
this regard either foreign or endogenous transcription 3'
end formation and polyadenylation signals suitable for
obtaining expression of the chimeric gene can be used.
For example, the foreign 3' untranslated ends of genes,
such as gene 7 (Velten and Schell (1985) Nucl. Acids Res.
13:6998), the octopine synthase gene (De Greve et al.,
1982, J. Mol. Appl. Genet. 1:499; Gielen et al (1983) EMBO
J. 3:835; Ingelbrecht et al., 1989, The Plant Cell 1:671)
and the nopaline synthase gene of the T-DNA region of
Agrobacterium tumefaciens Ti-plasmid (Depicker et al.,
1982, J. Mol. Appl. Genet. 1:561), or the chalcone synthase
gene (Sommer and Saedler, 1986, Mol. Gen. Genet. 202:429-
434), or the CaMV 19S/35S transcription unit (Mogen et
al., 1990, The Plant Cell 2:1261-1272) can be used.
However, it is preferred that the color genes in this 
invention carry their endogenous transcription 3' end 
formation and polyadenylation signals. 

The fertility-restorer gene, the male-sterility gene, 
the color gene or the first or second marker gene in 
accordance with the present invention are generally 
foreign DNAs, preferably foreign chimeric DNA. In this 
regard "foreign" and "chimeric" with regard to such DNAs 
have the same meanings as described in EP 0,344,029 and 
EP 0,412,911. 

The cell of a plant, particularly a plant capable of
being infected with Agrobacterium such as most
dicotyledonous plants (e.g. Brassica napus) and some
monocotyledonous plants, can be transformed using a
vector that is a disarmed Ti-plasmid containing the male-
sterility gene, the color-linked restorer gene or both
and carried by Agrobacterium. This transformation can be
carried out using the procedures described, for example,
in EP 0,116,718 and EP 0,270,822. Preferred Ti-plasmid
vectors contain the foreign DNA between the border
sequences, or at least located to the left of the right
border sequence, of the T-DNA of the Ti-plasmid. Of
course, other types of vectors can be used to transform
the plant cell, using procedures such as direct gene
transfer (as described, for example, in EP 0,233,247),
pollen mediated transformation (as described, for
example, in EP 0,270,356, PCT patent publication "WO"
85/01856, and US patent 4,684,611), plant RNA virus-
mediated transformation (as described, for example, in EP
0,067,553 and US patent 4,407,956) and liposome-mediated
transformation (as described, for example, in US patent
4,536,475). Cells of monocotyledonous plants such as the
major cereals including corn, rice, wheat, barley, and
rye, can be transformed (e.g. by electroporation) using
wounded or enzyme-degraded intact tissues capable of
forming compact embryogenic callus (such as immature
embryos in corn), or the embryogenic callus (such as type
I callus in corn) obtained thereof, as described in WO
92/09696. In case the plant to be transformed is corn,
other recently developed methods can also be used such
as, for example, the method described for certain lines
of corn by Fromm et al., 1990, Bio/Technology 8:833;
Gordon-Kamm et al., 1990, Bio/Technology 2:603 and Gould
et al., 1991, Plant Physiol. 95:426. In case the plant to
be transformed is rice, recently developed methods can
also be used such as, for example, the method described 
for certain lines of rice by Shimamoto et al., 1989, 
Nature 338:274; Datta et al., 1990, Bio/Technology 8:736; 
and Hayashimoto et al., 1990, Plant Physiol. 93:857. 

The transformed cell can be regenerated into a mature 
plant and the resulting transformed plant can be used in 
a conventional breeding scheme to produce more 
transformed plants with the same characteristics or to 
introduce the male-sterility gene, the color-linked
restorer gene (or both), in other varieties of the same
or related plant species. Seeds obtained from the
transformed plants contain the chimeric gene(s) of this 
invention as a stable genomic insert. Thus the male- 
sterility gene, or the color-linked restorer gene of this 
invention when introduced into a particular line of a 
plant species can always be introduced into any other 
line by backcrossing. 

The first parent plant of this invention contains the 
male-sterility gene as a stable insert in its nuclear 
genome (i.e. it is a male-sterile plant). For the
purposes of this invention it is preferred that the first
parent plant contains the male-sterility gene in
homozygous condition so that it transmits the gene to all
of its progeny.

The second parent plant of this invention contains
the male-sterility gene and the color-linked restorer
gene as stable inserts in its nuclear genome (i.e. it is
a restored plant). It is preferred that the male-
sterility gene be in homozygous condition so that the
second parent plant transmits the gene to all of its
progeny and that the color-linked restorer gene be in
heterozygous condition so that the second parent plant
transmits the gene to only half of its progeny.

It is preferred that the first and second parent
plants are produced from the same untransformed line of a
plant species, particularly from the same inbred line of
that species.

The first and second parent plants of this invention 
have the particular advantage that seeds of such plants 
can be maintained indefinitely, and can be amplified to 
any desired amount (e.g. by continuous crossing of the 
two plant lines).

The color genes of this invention can be used as
marker genes in any situation in which it is worthwhile to
detect the presence of a foreign DNA (i.e. a transgene)
in seeds of a transformed plant in order to isolate seeds
which possess the foreign DNA. In this regard virtually
any foreign DNA, particularly a chimeric gene, can be
linked to the color gene.

Examples of such foreign DNAs are genes coding for
insecticidal (e.g. from Bacillus thuringiensis),
fungicidal or nematocidal proteins. Similarly the color
gene can be linked to a foreign DNA which is the male-
sterility gene as used in this invention.

However, the color genes are believed to be of 
particular use in the process of this invention in which 
they are present in a foreign DNA which comprises a 
fertility restorer gene (such as the barstar gene of 
Bacillus amyloliquefaciens) under control of a stamen-
specific promoter (such as PTA29). In appropriate
conditions the use of the color genes allows the easy
separation of harvested seeds that will grow into male-
sterile plants, and harvested seeds that will grow into
male-fertile plants. In this regard the seeds are
preferably harvested from male-sterile plants (the first
parent plants) that are homozygous at a male-sterility
locus (such as a locus comprising the barnase gene under
control of PTA29) and which have been pollinated by
restorer plants (the second parent plants of this
invention) which contain in their genome two unlinked
gene loci, one of which is the same male-sterility
locus which is homozygous for the same male-sterility
gene while the other is a foreign locus which comprises
an appropriate fertility restorer gene (i.e. whose
expression will counteract the expression of the male-
sterility gene) and also the color gene of this
invention, particularly an R or B gene that is expressed
in the aleurone and/or a C1 gene, preferably the B-peru
and C1 gene (e.g. as described in the examples). First
and second parent plants can be essentially produced as
described in the examples and as summarized in Figure 1.
In step 8 of Figure 1 it is demonstrated that the
crossing of the first and second parent plants of this
invention will give rise in the progeny to about 50% new
first parent (i.e. male-sterile) plants and about 50%
new second parent (i.e. male-fertile) plants and that
these two types of plants can already be separated at the
seed stage on the basis of color. Red kernels will grow
these two types of plants can already be separated at the 
seed stage on the basis of color. Red kernels will grow 
into male-fertile plants while yellow kernels will grow 
into male-sterile plants. 
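
The 50/50 segregation described above follows directly from the genotypes of the two parents (S/S; -/- for the first parent and S/S; Rf/- for the second). The Python sketch below enumerates the gametes of both parents and tallies the progeny classes. It assumes two unlinked loci, that one copy of Rf suffices both for restored fertility and for red kernel color, and equal transmission of all gametes. The locus names and helper functions are hypothetical labels chosen for illustration.

```python
from collections import Counter
from itertools import product


def gametes(genotype):
    """List all gametes of a diploid genotype given as
    {locus: (allele, allele)}, assuming unlinked loci."""
    loci = sorted(genotype)
    return [dict(zip(loci, combo))
            for combo in product(*(genotype[locus] for locus in loci))]


def phenotype(child):
    """Kernel color and fertility under the stated assumptions."""
    has_s = "S" in child["sterility"]
    has_rf = "Rf" in child["restorer"]
    fertile = (not has_s) or has_rf
    return ("red" if has_rf else "yellow",
            "male-fertile" if fertile else "male-sterile")


def cross(female, male):
    """Return the expected fraction of each progeny phenotype class."""
    counts = Counter()
    for egg, pollen in product(gametes(female), gametes(male)):
        child = {locus: (egg[locus], pollen[locus]) for locus in egg}
        counts[phenotype(child)] += 1
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


first_parent = {"sterility": ("S", "S"), "restorer": ("-", "-")}
second_parent = {"sterility": ("S", "S"), "restorer": ("Rf", "-")}

progeny = cross(first_parent, second_parent)
```

Under these assumptions the progeny fall into exactly two classes, red male-fertile and yellow male-sterile, each with an expected fraction of one half, so sorting kernels by color regenerates both parent types in every generation.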

Thus a line of male-sterile first parent plants of 
this invention can be easily maintained by continued 
crossing with the second parent plants of this invention 
with, in each generation, harvesting the seeds from the 
male-sterile plants and separation of the yellow and red 
kernels. Of course in this way any desired amount of seed 
for foundation seed production of a particular line, such 
as an inbred line, can also be easily obtained. 

The red and yellow seeds harvested from a cereal
plant (e.g. the first parent plant of this invention) can
be separated manually. However, such separation can also
be effected mechanically. A color sorting machine for
corn kernels and other granular products is for instance
available from Xeltron U.S. (Redmond, Washington, U.S.A.).

Unless otherwise indicated all experimental 
procedures for manipulating recombinant DNA were carried 
out by the standardized procedures described in Sambrook 
et al., 1989, "Molecular Cloning: a Laboratory Manual", 
Cold Spring Harbor Laboratory, and Ausubel et al, 1994,
"Current Protocols in Molecular Biology", John Wiley &
Sons.

The polymerase chain reaction ("PCR") was used to
clone and/or amplify DNA fragments. PCR with overlap
extension was used in order to construct chimeric genes
(Horton et al, 1989, Gene 77:61-68; Ho et al, 1989, Gene
77:51-59).

All PCR reactions were performed under conventional
conditions using the Vent(TM) polymerase (Cat. No. 254L,
New England Biolabs, Beverly, MA 01915, U.S.A.) isolated
from Thermococcus litoralis (Neuner et al., 1990,
Arch. Microbiol. 153:205-207). Oligonucleotides were
designed according to known rules as outlined for example
by Kramer and Fritz (1987, Methods in Enzymology
154:350), and synthesized by the phosphoramidite method
(Beaucage and Caruthers, 1981, Tetrahedron Letters
22:1859) on an Applied Biosystems 380A DNA synthesizer
(Applied Biosystems B.V., Maarssen, Netherlands).

In the following examples, reference will be made to 
the following sequence listing and figures: 

Sequence Listing

SEQ ID NO 1 : sequence of C1 gene
SEQ ID NO 2 : plasmid pTS256
SEQ ID NO 3 : EcoRI-HindIII region of pTS200
              comprising the chimeric gene
              PCA55-barstar-3'nos (the omitted
              region of pTS200 is derived from
              pUC19)
SEQ ID NO 4 : oligonucleotide 1
SEQ ID NO 5 : pCOL9 containing the shortened C1
              gene as an EcoRI-SfiI fragment
SEQ ID NO 6 : presumed sequence of the EcoRI-
              HindIII region of pCOL13
              containing the shortened B-peru
              gene (the rest of the plasmid is
              pUC19). The stretch of N
              nucleotides corresponds to a
              region of approximate length
              which is derived from the genomic
              clone of the B-peru gene but for
              which the sequence needs to be
              confirmed.
            : actual sequence of the EcoRI-
              HindIII region of pCOL13
              containing the shortened B-peru
              gene (the rest of the plasmid is
              pUC19).

Figures 

Figure 1 : Breeding scheme to obtain the first and
           second parent plants of this invention
Figure 2 : Schematic structure of pCOL25, pCOL26,
           pCOL27, pCOL28, pCOL100 and pDE110.

Examples 

Example 1 : Construction of plasmids containing the male-
sterility gene comprising the TA29 promoter and the
barnase coding region

Plasmids useful for transformation of corn plants and
carrying a male-sterility gene and a selectable marker
gene have been described in WO 92/09696 and WO 92/00275.

Plasmid pVE107 contains the following chimeric genes:
1) PTA29-barnase-3'nos, i.e. a DNA coding for barnase of
Bacillus amyloliquefaciens (barnase) operably linked to
the stamen-specific promoter of the TA29 gene of
Nicotiana tabacum (PTA29) and the 3' regulatory sequence
containing the polyadenylation signal of the nopaline
synthase gene of Agrobacterium tumefaciens (3'nos), and
2) P35S-neo-3'ocs, i.e. the coding region of the gene of
Tn5 of E. coli coding for neomycin phosphotransferase
(neo) operably linked to the 35S promoter of Cauliflower
Mosaic Virus (P35S) and the 3' regulatory sequence
containing the polyadenylation signal of the octopine
synthase gene of Agrobacterium tumefaciens (3'ocs).

Plasmid pVE108 contains the following chimeric genes:
1) PTA29-barnase-3'nos, and 2) P35S-bar-3'nos, i.e. the
gene of Streptomyces hygroscopicus (EP 242236) coding for
phosphinothricin acetyl transferase (bar) operably linked
to the P35S and 3'nos.

PTA29-barnase-3'nos is an example of a foreign chimeric
male-sterility gene (S) used in this invention.

Example 2 : Construction of a plasmid containing the 
color-linked restorer gene 

2.1. Obtaining a shortened functional C1 gene

The C1 gene of maize was cloned from transposable-
induced mutants and its sequence was reported (Paz-Ares,
1987, EMBO J. 6:3553-3558). This sequence is reproduced
in SEQ ID NO. 1. Plasmid p36 (alternatively designated as
pClLC5kb and further designated as plasmid pXX036)
comprising a C1 genomic clone was obtained from Dr. H.
Saedler and Dr. U. Wienand of the Max-Planck Institut
für Züchtungsforschung, Köln, Germany. pXX036 was
digested with SnaBI and HindIII, filled-in with Klenow,
and selfligated, yielding plasmid pCOL9. pCOL9
corresponds to pUC19 (Yanisch-Perron et al, 1985, Gene
33:103-119) which contains, between its EcoRI and
modified HindIII sites, the 2189 bp EcoRI-SnaBI fragment
(corresponding to the sequence between positions 448 and
2637 of SEQ ID NO 1) of pXX036.

pXX036 was also digested with SfiI and HindIII and
treated with Klenow to make blunt ends. After ligation
the plasmid in which the DNA downstream from the SfiI
site was deleted was designated as pCOL12.

The sequence TGCAG in pCOL9, corresponding to the
sequence at positions 884 to 888 in SEQ ID NO 5, is
changed to TTAGG, yielding pCOL9S which instead of a
shortened C1 gene contains a shortened overexpressing C1-
S gene (Scheffler et al, 1994, Mol. Gen. Genet. 242:40-48).
A similar change is introduced in pCOL12, yielding
pCOL12S.

2.2. Obtaining a shortened functional B-peru gene 

Plasmid pBP2 (further designated as pXX004) is
plasmid pTZ18U (Mead et al., 1986, Protein Engineering
1:67; U.S. Biochemical Corp.) containing the genomic clone
of the B-peru gene. Plasmid p35SBPcDNA (further
designated as pXX002) is plasmid pMF6 (Goff et al, 1990,
EMBO J. 9:2517-2522) containing the cDNA corresponding to
the B-peru gene. Both plasmids were obtained from Dr. V.
Chandler of the University of Oregon, Oregon, U.S.A. A
2660 bp sequence of the genomic clone around the
translation initiation codon was reported
(EMBL/Genbank/DDBJ databases; locus name ZMBPERUA,
Accession number X70791; see also Radicella et al, 1992,
Genes & Development 6:2152-2164). The sequence of the B-
peru cDNA was also reported (Radicella et al, 1991, Plant
Mol. Biol. 17:127-130).

Substantial amounts of 5' and 3' flanking sequences
were deleted from pXX004, and the MluI-MunI fragment in
the coding region of the genomic clone was replaced by
the 1615 bp MluI-MunI fragment of the cDNA clone. The
resulting plasmid was designated as pCOL13 which was
deposited at the Belgian Coordinated Collection of
Microorganisms - LMBP Collection, Laboratory Molecular
Biology, University of Ghent, K.L. Ledeganckstraat 35, B-
9000 Ghent, Belgium and was given the Accession Number
LMBP 3041. A shortened but functional B-peru gene is
contained in pCOL13 as an EcoRI-SalI fragment with an
approximate length of 4 kbp (see SEQ ID NO 6).

2.3. Combining the C1 and B-peru genes

The C1 gene in pCOL9 and the B-peru gene in pCOL13
were then combined as follows. The 4 kbp EcoRI-SalI
fragment of pCOL13 was introduced between the EcoRI and
SalI sites of the vector pBluescript II SK(-)
(Stratagene), yielding #7 B SK(-). pCOL9 was digested
with SfiI, treated with Klenow to fill in protruding
ends, and further digested with EcoRI. The 1978 bp
SfiI(Klenow)/EcoRI fragment was then introduced between
the EcoRI and SmaI sites of #7 B SK(-), yielding
#7 B+C SK(-). Finally the XhoI site in the C1 sequence
was removed as follows. The 950 bp EcoRI-SacII fragment
of #7 B SK(-) (EcoRI site corresponding to the EcoRI site
at position 1506 in SEQ ID NO 1; the SacII site from the
pBluescript linker) was introduced between the EcoRI and
SacII sites of the Phagescript Vector (Stratagene) to
yield pCOL21. Single strands of pCOL21 were prepared and
hybridized to the following synthetic oligonucleotide 1
(SEQ ID No. 4):

5'-CGT TTC TCG AAT CCG ACG AGG-3'

resulting in a silent change (CTCGAG -> CTCGAA) and
removal of the XhoI site.
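
The effect of oligonucleotide 1 can be checked with a short script. The sketch below is illustrative only: it takes the 30-nucleotide window at positions 2071 to 2100 of SEQ ID NO 1, applies the oligonucleotide, and confirms that the XhoI site disappears while the affected codon still encodes glutamate. The reading frame is derived from the coding region (1078..2134) and intron (1211..1299, 1430..1575) features of the sequence listing. The function names and the two-entry codon lookup are assumptions introduced for this example.

```python
# Window of SEQ ID NO 1 (positions 2071-2100) that spans the XhoI site.
WINDOW_START = 2071
WILD_TYPE = "GGCGTCGTTTCTCGAGTCCGACGAGGACTG"
OLIGO_1 = "CGTTTCTCGAATCCGACGAGG"  # oligonucleotide 1, spaces removed
XHOI = "CTCGAG"

# Feature coordinates taken from the sequence listing of SEQ ID NO 1.
CDS_START = 1078
INTRONS = [(1211, 1299), (1430, 1575)]

# Only the two codons involved are needed here (both encode glutamate).
CODONS = {"GAG": "Glu", "GAA": "Glu"}


def apply_oligo(template: str, oligo: str, max_mismatches: int = 1) -> str:
    """Overwrite the first window of `template` that the oligo matches closely."""
    for start in range(len(template) - len(oligo) + 1):
        window = template[start:start + len(oligo)]
        mismatches = sum(a != b for a, b in zip(window, oligo))
        if mismatches <= max_mismatches:
            return template[:start] + oligo + template[start + len(oligo):]
    raise ValueError("oligonucleotide does not anneal to the template")


def coding_offset(position: int) -> int:
    """Number of coding nucleotides that precede `position` (1-based)."""
    offset = position - CDS_START
    for first, last in INTRONS:
        if last < position:
            offset -= last - first + 1
    return offset


mutated = apply_oligo(WILD_TYPE, OLIGO_1)
site_at = WILD_TYPE.index(XHOI) + WINDOW_START  # first base of the XhoI site
codon_start = site_at + 3                       # second codon of the site
i = codon_start - WINDOW_START
old_codon, new_codon = WILD_TYPE[i:i + 3], mutated[i:i + 3]

print("XhoI site starts at", site_at)
print("codon in frame:", coding_offset(codon_start) % 3 == 0)
print("XhoI site left after mutagenesis:", XHOI in mutated)
print(old_codon, "->", new_codon, ":", CODONS[old_codon], "->", CODONS[new_codon])
```

The frame check matters because a change is only silent when the altered base falls in a codon whose meaning is unchanged; here the third base of a GAG codon becomes A.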

The 710 bp AatII-SacII fragment of #7 B SK(-) was then
exchanged for the corresponding AatII-SacII fragment of
the mutated pCOL21, yielding pCOL23.

pCOL23 was then linearized with SacII, treated with
Klenow, and ligated to XhoI linker sequence (Stratagene),
yielding pCOL24.

Using the same procedure as described above, the
shortened C1-S gene of pCOL9S is combined with the
shortened B-peru gene of pCOL23, yielding plasmid
pCOL24S.

2.4. Construction of vectors comprising the C1 and B-peru
genes as well as male-sterility gene and a selectable
marker gene

pTS256 is derived from pUC19 and contains the
following two chimeric genes: 1) P35S-bar-3'nos, and 2)
PTA29-barstar-3'nos, i.e. a DNA coding for barstar of
Bacillus amyloliquefaciens (barstar or bar*) operably
linked to PTA29 and 3'nos. The complete sequence of
pTS256 is given in SEQ ID NO 2.

pTS200 is derived from pUC19 and contains the
following two chimeric genes: 1) P35S-bar-3'nos, and 2)
PCA55-barstar-3'nos, i.e. barstar operably linked to the
stamen-specific promoter PCA55 of Zea mays and 3'nos. The
complete sequence of pTS200 is given in SEQ ID NO 3.

pTS256 was modified by the inclusion of NotI linkers
(Stratagene) in both the unique SspI and SmaI sites,
yielding pTS256NN. The shorter BspEI-SacII fragment of
pTS256NN was then replaced by the shorter BspEI-SacII
fragment of pTS200, yielding pTS256NN+200.

pTS256NN contains P35S-bar-3'nos and PTA29-barstar-3'nos
on a NotI cassette. pTS256NN+200 contains P35S-bar-3'nos
and PCA55-barstar-3'nos on a NotI cassette.

The NotI cassette of pTS256NN was introduced in the
NotI site of pCOL24, yielding pCOL25 and pCOL26, which
differ with respect to the orientation of the
P35S-bar-3'nos gene with respect to the shortened C1 gene
(Figure 2).

The NotI cassette of pTS256NN+200 was introduced in
the NotI site of pCOL24, yielding pCOL27 and pCOL28, which
differ with respect to the orientation of the
P35S-bar-3'nos gene with respect to the shortened C1 gene
(Figure 2).

Plasmids pCOL25, pCOL26, pCOL27 or pCOL28 contain a
color-linked restorer gene Rf and a selectable marker
gene (P35S-bar-3'nos). Rf comprises the shortened C1 and
B-peru genes and a chimeric barstar gene (either
PTA29-barstar-3'nos or PCA55-barstar-3'nos).

Plasmids pCOL25S, pCOL26S, pCOL27S or pCOL28S,
containing the shortened C1-S gene instead of the
shortened C1 gene, are obtained in a similar way using
pCOL24S instead of pCOL24.

2.5. Construction of vectors comprising the C1 and B-peru
genes as well as male-sterility gene

Plasmid pTS59 can be obtained from plasmid pTS256 (of SEQ
ID NO 2) by replacing the fragment extending from
positions 1 to 1470 (comprising the chimeric gene
P35S-bar-3'nos) with the sequence TATGATA. Then NotI linkers
(Stratagene) were introduced in the EcoRV and SmaI sites
of pTS59, yielding pTS59NN. Finally the NotI fragment
comprising the chimeric gene PTA29-barstar-3'nos was
introduced in the NotI site of #7 B+C SK(-), yielding
pCOL100 (the general structure of pCOL100 and pDE110 is
also presented in Figure 2).

2.6. Expression of shortened C1 and B-peru in aleurone in
corn seeds

Dry seeds were incubated overnight in water at room
temperature and were then peeled and sliced in half. Four
to six half kernels were placed with the cut side on wet
filter paper and were bombarded with tungsten particles
(diameter 0.7 µm) which were coated with DNA.
Particle bombardment was essentially carried out using
the particle gun and procedures as described by Zumbrunn
et al, 1989, Technique, 1:204-216. The tissue was placed
at 10 cm from the stopping plate while a 100 µm mesh was
placed at 5 cm from the stopping plate.

DNA of the following plasmids was used:

- pXX002: B-peru cDNA under control of the 35S promoter
- pXX201: C1 cDNA under control of the 35S promoter
- pCOL13: shortened B-peru gene as described in Example 2.2
- pCOL12: shortened C1 gene as described in Example 2.1
- pCOL100: shortened B-peru and shortened C1 and
  PTA29-barstar-3'nos as described in Example 2.5.

After bombardment the tissue was incubated for 2 days on
wet filter paper at 27 °C and was then checked for the
presence of red spots indicating anthocyanin production.

Note: + indicates that anthocyanin production was
observed in at least one experiment; - indicates that no
anthocyanin production was observed; nt = not tested.

The results for three public lines (H99, Pa91, B73)
and 9 different, commercially important, proprietary
inbred lines from various sources are shown in Table 1.
The line c-ruq is a tester line which is homozygous for a
C1 allele that is inactivated by insertion of a receptor
for the regulator Uq (Cormack et al., 1988, Crop Sci.
28:941-944).

All lines which were r and c1 produced anthocyanin in
the aleurone after introduction of both a functional
B-peru and C1 gene. Lines which were R and c1 produced
anthocyanin upon introduction of a functional C1 gene.
Lines which were r and C1 produced anthocyanin upon
introduction of a functional B-peru gene. This proves
that the B-peru and C1 genes are sufficient for
anthocyanin production in most corn lines. From the data
in Table 1 it is also evident that even the shortened
B-peru and C1 genes are still functional and are capable of
producing anthocyanin in aleurone of corn lines with
suitable genotypes.
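
The complementation pattern described above can be summarized as a simple rule: aleurone pigmentation needs one functional R/B-type regulator and one functional C1 regulator, supplied either by the host line or by the bombarded DNA. The sketch below encodes only that rule. It assumes that all other genes of the pathway are functional in the host, and the function and parameter names are hypothetical.

```python
def produces_anthocyanin(host_has_R: bool, host_has_C1: bool, introduced: set) -> bool:
    """Predict aleurone pigmentation from host genotype and introduced genes.

    `introduced` holds the regulatory genes delivered by bombardment,
    named "B-peru" and/or "C1" (labels chosen for this example).
    """
    has_r_or_b = host_has_R or "B-peru" in introduced
    has_c1 = host_has_C1 or "C1" in introduced
    return has_r_or_b and has_c1


# The three host classes discussed in the text, plus one negative case.
print(produces_anthocyanin(False, False, {"B-peru", "C1"}))  # r, c1 host + both genes
print(produces_anthocyanin(True, False, {"C1"}))             # R, c1 host + C1
print(produces_anthocyanin(False, True, {"B-peru"}))         # r, C1 host + B-peru
print(produces_anthocyanin(False, False, {"C1"}))            # r, c1 host + C1 only
```

This is a logical summary only; it does not model expression level, tissue specificity, or the Table 1 entries marked as not tested.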

Example 3: Production of first parent corn plants by
transformation of corn with the plasmids of Example 1.

Corn plants of line H99, transformed with a
male-sterility gene comprising a DNA encoding barnase of
Bacillus amyloliquefaciens under control of the promoter
of the TA29 gene of Nicotiana tabacum, have been
described in WO 92/09696. The transformed plants were
shown to be male-sterile.

Example 4: Production of second parent corn plants by
transformation of corn with the plasmids of Example 2.

Corn inbred lines H99 and Pa91 are transformed using
the procedures as described in WO 92/09696 but using
plasmids pCOL25, pCOL26, pCOL27 or pCOL28 of Example 2.
Regenerated plants are selected that are male fertile and
in which the shortened C1 gene, the shortened B-peru gene,
the P35S-bar-3'nos gene, and the PTA29-barstar-3'nos (or
PCA55-barstar-3'nos) gene are expressed.

Alternatively the male-sterile plants of Example 3 
(already containing the S gene) can be transformed with 
plasmids pCOL25, pCOL26, pCOL27 or pCOL28 of Example 2 on 
the condition that the S and Rf genes are linked to 
different selectable marker genes. 

Similarly, transformed corn plants are obtained using
plasmids pCOL25S, pCOL26S, pCOL27S or pCOL28S of Example
2.

In an alternative set of experiments the second
parent plants of this invention were obtained by
transforming corn plants of lines H99, Pa91, and
(Pa91xH99)xH99 with two separate plasmids, one of which
contained the color-linked restorer gene (pCOL100), while
the other contained an appropriate selectable marker gene
such as a chimeric bar gene (pDE110) (alternatively a
chimeric neo gene may also be used). pDE110 was described
in WO 92/09696 and the construction of pCOL100 was
described in Example 2.5.

In yet another set of experiments the second parent
plants of this invention are obtained by transforming
corn plants with a purified fragment of the plasmids of
Example 2.4. Such purified fragment is obtained by
digestion of the plasmids of Example 2.4 with XhoI and
subsequent purification using conventional procedures
such as gel filtration.

Untransformed corn plants of lines H99 or Pa91 are
detasseled and pollinated with pollen of the plants
transformed with the Rf DNA. It is observed that the Rf
gene segregates in a Mendelian way and that the seed that
is harvested from these plants is colored and non-colored
(yellow) in a 1:1 ratio. The red color of the seeds is
correlated with the presence of the Rf gene.

Example 5: The production of the first and second parent
plants of this invention.

First parent plants and second parent plants (i.e. 
maintainer plants) according to the invention are 
produced along the lines set out in Figure 1. 

The male-sterile plants of step 1 are those produced 
in Example 1. The corn plants transformed with the 
color-linked restorer gene of step 2 are those produced 
in Example 4. 

A plant of Example 1 and a plant of Example 4 are
crossed (Step 3) and the progeny plants with the genotype
S/-, Rf/- are selected (Step 4), e.g. by demonstrating
the presence of both the S and Rf genes in the nuclear
genome (e.g. by means of PCR).

The plants selected in Step 4 are then crossed with
the male-sterile plants with genotype S/- (Step 5). The
colored seeds (i.e. those containing the Rf gene) are
selected, grown into plants, and examined for the
presence of both the S and Rf genes (e.g. by PCR). The
plants containing both the S and Rf genes are selfed and
the seeds of each plant are examined for seed color (red
or yellow). From the progeny of the selfings the
non-colored seeds are grown into plants (Step 6). The progeny
of the selfings in which all noncolored seeds grow into
male-sterile plants are retained (Step 6). These
male-sterile plants are all homozygous for the S gene and are
crossed with their fertile siblings (of genotype
S/S,Rf/Rf or S/S,Rf/-) (Step 7). For some crossings the
seeds harvested from the male-sterile plants are 50%
colored and 50% non-colored (Step 7). The colored seeds
all grow into fertile corn plants of genotype S/S,Rf/-,
which are the maintainer plants, or the second parent
plants, of the present invention. The noncolored seeds
all grow into male-sterile plants of the genotype S/S,-/-,
which are the first parent plants of this invention (Step
7).

The first and second parent plants are crossed and
the seeds harvested from the male-sterile plants are
separated on the basis of color (Step 8). All colored
seeds grow into second parent plants and all
noncolored seeds grow into first parent plants, thereby
establishing an easy maintenance of a pure male-sterile
line of corn.
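
The color ratios in Steps 7 and 8 follow from ordinary Mendelian segregation. The following sketch enumerates the gametes of two parents and tabulates the progeny genotypes. It assumes that the S and Rf loci segregate independently and that every gamete is equally viable; the genotype encoding and helper names are illustrative, not part of the described method.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def gametes(genotype):
    """Return gamete -> frequency for a genotype given as a tuple of allele pairs."""
    counts = Counter(product(*genotype))
    total = sum(counts.values())
    return {gamete: Fraction(n, total) for gamete, n in counts.items()}


def cross(female, male):
    """Return progeny genotype -> frequency, with "-" written last at each locus."""
    progeny = Counter()
    for (egg, p_egg), (pollen, p_pollen) in product(
        gametes(female).items(), gametes(male).items()
    ):
        child = tuple(
            tuple(sorted(pair, key=lambda allele: allele == "-"))
            for pair in zip(egg, pollen)
        )
        progeny[child] += p_egg * p_pollen
    return dict(progeny)


def is_colored(genotype):
    """A seed is colored when at least one Rf copy is present."""
    return "Rf" in genotype[1]


# Loci are ordered (S locus, Rf locus).
first_parent = (("S", "S"), ("-", "-"))  # male-sterile, grown from noncolored seed
maintainer = (("S", "S"), ("Rf", "-"))   # fertile, grown from colored seed

offspring = cross(first_parent, maintainer)
colored = sum(freq for g, freq in offspring.items() if is_colored(g))
for genotype, freq in offspring.items():
    print(genotype, freq, "colored" if is_colored(genotype) else "noncolored")
print("fraction colored:", colored)
```

The same `cross` helper can be applied to the test cross of Example 4 (an untransformed plant pollinated by an Rf/- plant) by passing `(("-", "-"), ("-", "-"))` as the female and `(("-", "-"), ("Rf", "-"))` as the male.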

If the plant DNA that is flanking the S gene in the
plants of Example 1 has been characterized, the progeny
of the cross in Step 5 with genotypes S/S,-/- and S/S,Rf/-
can be easily identified by means of PCR using probes
corresponding to the flanking plant DNA. In this way Step
6 can be skipped because the plants of Step 5 which grow
from colored seeds (genotype S/S,Rf/-) can be crossed
directly to plants with genotype S/S,-/- (as in Step 7).

All publications cited in this application are hereby 
incorporated by reference. 

Example 6: Maintainer plants containing a color-linked
restorer gene comprising the B-peru coding region under
control of the promoter of the C1-S gene.

Using conventional techniques a chimeric gene is inserted
between the EcoRI and HindIII sites of the polylinker of
plasmid pUC19. The chimeric gene comprises the following
elements in sequence:

i) the promoter region of the C1-S gene, i.e. the DNA
fragment with the sequence of SEQ ID No. 1 from
nucleotide positions 447 up to 1076 but containing at
nucleotide positions 935-939 the sequence TTAGG instead
of TGCAG.

ii) a single C nucleotide

iii) the coding region and 3' untranslated region of
the B-peru gene, i.e. the DNA fragment with the sequence
of SEQ ID No. 7 from nucleotide positions 576 up to 4137.

This plasmid (designated as pLH52), together with plasmid
pCOL9S of Example 2 (comprising a C1-S gene) and pTS256
of SEQ ID No. 2 (comprising the following chimeric genes:
P35S-bar-3'nos and PTA29-barstar-3'nos), is used to
transform corn essentially as described in Example 4. The
transformed plants are then used to obtain second parent
plants as described in Example 5.
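
The promoter modification in element i) can be illustrated on the sequence listing itself. This sketch uses the row of SEQ ID NO 1 covering positions 901 to 960 and swaps positions 935-939. The helper name `substitute` and the variable names are assumptions made for the example.

```python
# Row of SEQ ID NO 1 covering positions 901-960 (spaces removed).
ROW_START = 901
ROW = "AGCAGTGTCGTGTCGTCCATGCATGCACTTTAGGTGCAGTGCAGGGCCTCAACTCGGCCA"


def substitute(seq: str, seq_start: int, first: int, last: int, new: str) -> str:
    """Replace 1-based positions first..last of a sequence that starts at seq_start."""
    i, j = first - seq_start, last - seq_start + 1
    if len(new) != j - i:
        raise ValueError("replacement must have the same length as the target")
    return seq[:i] + new + seq[j:]


original = ROW[935 - ROW_START:939 - ROW_START + 1]
c1s_row = substitute(ROW, ROW_START, 935, 939, "TTAGG")

print("positions 935-939 in C1:  ", original)
print("positions 935-939 in C1-S:", c1s_row[935 - ROW_START:939 - ROW_START + 1])
```

Because the replacement has the same length as the target, all downstream coordinates (for example the TATA-box at 1033..1038) stay valid in the modified promoter.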

Example 7: Maintainer plants containing a color-linked
restorer gene comprising the B-peru coding region under
control of the 35S promoter.

Using conventional techniques a chimeric gene is inserted
between the EcoRI and HindIII sites of the polylinker of
plasmid pUC19. The chimeric gene comprises the following
elements in sequence:

i) the promoter region of the 35S promoter, i.e. the
DNA fragment of pDE110 which essentially has the sequence
as described in SEQ ID No. 4 of WO 92/09696 (which is
incorporated herein by reference) from nucleotide
positions 396 up to 1779

ii) the coding region and 3' untranslated region of
the B-peru gene, i.e. the DNA fragment with the sequence
of SEQ ID No. 7 from nucleotide positions 576 up to 4137.

This plasmid (designated as pP35S-Bp), together with
plasmid pCOL9S of Example 2 (comprising a C1-S gene) and
pTS256 of SEQ ID No. 2 (comprising the following chimeric
genes: P35S-bar-3'nos and PTA29-barstar-3'nos), is used
to transform corn essentially as described in Example 4.
The transformed plants are then used to obtain second
parent plants as described in Example 5.

Alternatively plasmid p35SBperu as described in Goff et
al, 1990, EMBO J. 9:2517-2522 is used instead of pP35S-Bp.

Example 8: Maintainer plants containing a color-linked
restorer gene comprising the maize P gene coding region
under the control of the promoter of the C1-S gene.

Using conventional techniques a chimeric gene is inserted
in the EcoRI site of the polylinker of plasmid pUC19. The
chimeric gene comprises the following elements in
sequence:

i) the promoter region of the C1-S gene, i.e. the DNA
fragment with the sequence of SEQ ID No. 1 from nucleotide
positions 447 up to 1076 but containing at nucleotide
positions 935-939 the sequence TTAGG instead of TGCAG;

ii) a single C nucleotide;

iii) a DNA sequence comprising the coding region and
3' end untranslated region of the maize P gene as
described by Grotewold et al, 1991, PNAS 88:4587-4591
(nucleotides 320-1517). The maize P gene is an
anthocyanin regulatory gene which specifies red
phlobaphene pigmentation, a flavonoid pigment involved in
the biosynthetic pathway of anthocyanin. In fact, the
protein encoded by the P gene activates, among others,
the A1 gene required for both anthocyanin and phlobaphene
pigmentation. Two cDNA clones have been isolated and
sequenced by Grotewold et al and are described in the
publication referred to above. It is the longer cDNA
which is of particular interest for construction of this
chimeric gene. However, alternatively, the coding region
of the shorter transcript can also be used in this
chimeric gene, as well as the P gene leader sequence
instead of the C1-S gene leader sequence. The P gene does
not require a functional R or B gene to produce
pigmentation. The visible pigment that is produced in the
seeds of the maintainer plants is phlobaphene, a
flavonoid pigment (like anthocyanin) directly involved in
anthocyanin biosynthesis.

iv) a DNA fragment containing the polyadenylation
signal of the nopaline synthase gene of Agrobacterium
tumefaciens, i.e. the DNA fragment with the sequence of
SEQ ID No. 2 from nucleotide position 1600 up to
nucleotide position 2909.

The resulting plasmid (designated as pPCSl-P), together
with pTS256 of SEQ ID No. 2, is used to transform corn
essentially as described in Example 4. The transformed
plants are then used to obtain second parent plants as
described in Example 5.

Example 9: Maintainer plants containing a color-linked
restorer gene comprising the B-peru coding region under
the control of the B-peru promoter.

Using conventional techniques a chimeric gene is inserted
between the EcoRI and the HindIII sites of the polylinker
of plasmid pUC19. The chimeric gene comprises the
following elements in sequence:

i) the promoter of the B-peru gene, i.e. a 1952 bp
DNA sequence as disclosed in the EMBL databank under
accession number X70791;

ii) the coding region and 3' untranslated region of
the B-peru gene, i.e. the DNA fragment with the sequence
of SEQ ID No. 7 from nucleotide position 576 up to 4137.

This plasmid (designated as pCOL11), together with plasmid
pCOL9S of Example 2 (comprising a C1-S gene) and pTS256
of SEQ ID No. 2 (comprising the following chimeric genes:
P35S-bar-3'nos and PTA29-barstar-3'nos), is used to
transform corn essentially as described in Example 4. The




transformed plants are then used to obtain second parent
plants as described in Example 5.

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: PLANT GENETIC SYSTEMS N.V. 

(B) STREET: Jozef Plateaustraat 22 

(C) CITY: Ghent 

(E) COUNTRY: Belgium 

(F) POSTAL CODE (ZIP) : 9000 

(G) TELEPHONE: 32 9 235 84 iT 

(H) TELEFAX: 32 9 224 06 94 

(I) TELEX: 11.361 Pgsgen 

(ii) TITLE OF INVENTION: Use of anthocyanin genes to maintain 
male-sterile plants 

(iii) NUMBER OF SEQUENCES: 7 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/254,776 

(B) FILING DATE: 06-JUN-1994 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4059 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:
(A) ORGANISM: C1 gene of Zea mays

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 279..284
(D) OTHER INFORMATION: /label= HpaI

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 447..452
(D) OTHER INFORMATION: /label= EcoRI

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 1735..1740
(D) OTHER INFORMATION: /label= AatII

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 1505..1510
(D) OTHER INFORMATION: /label= EcoRI

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 2081..2086
(D) OTHER INFORMATION: /label= XhoI

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 2418..2430
(D) OTHER INFORMATION: /label= SfiI

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 2669..2674
(D) OTHER INFORMATION: /label= SnaBI

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 2634..2639
(D) OTHER INFORMATION: /label= SnaBI

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 3008..3013
(D) OTHER INFORMATION: /label= HpaI

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 1..1077
(D) OTHER INFORMATION: /label= PC1
/note= "region containing promoter of C1 gene"

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 1078..2134
(D) OTHER INFORMATION: /label= C1
/note= "coding region of C1 gene"

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 2135..2430
(D) OTHER INFORMATION: /label= 3'C1
/note= "region containing polyadenylation signal of C1 gene"

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 1033..1038
(D) OTHER INFORMATION: /label= TATA-Box

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 1061..1062
(D) OTHER INFORMATION: /label= transcript-init
/note= "transcription initiation site"

(ix) FEATURE:
(A) NAME/KEY: intron
(B) LOCATION: 1211..1299

(ix) FEATURE:
(A) NAME/KEY: intron
(B) LOCATION: 1430..1575

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 935..939
(D) OTHER INFORMATION: /label= C1-S
/note= "TGCAG sequence (in C1 gene) which in the C1-S
sequence is changed to TTAGG"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

TATCAACCTC CTGTGTTATT TTTAGTGACG GTTTCTTAAA AAACACCACT AGAAATCGTA 60 

TTTTTATAGG TGGTTCCTTA AGAAAACTGC ATGCAGAAAT CCATGACGGT TTTCTTAAGG 120 

AACCGTATGT AGAAATACGA TTTCTAGTGA CGATCTTCTT AAGGAAACCA CCACTAAAAA 180 

TTATTTTTAT CCTTAATTTT CGAGTTTTTC AAACGATCTC GTATGATGAA ACCATCAAAA 240 

TAAAAGTTGT ACATCTCTAA AAGTTATGAA AATTTGTAGT TAACAACTTT TTTATTTGAA 300

CTCATTTTGG TTCTCAAAAA TTGCATCTAA ATTTGTCAAA TTTAAAATTC AAATTTTCCA 360

AACGACCTCG GATGAAAAAA GTGTCAAAAT GAAAGTTGTA GAACTTCAAA AGTTATTCAA 420

CTTTGTAGTC GACTATCTTT TTATTTGAAT TCGCTTACGG TCTCAAACAA GCAATTTACA 480 

CTCAGTTGGT TGTAATATGT GGACAATAAA ACTACAAACT AGACACAAAT CATACCATAG 540 

ACGGAGTGGT AGCAGAGGGT ACGCGCGAGG GTGAGATAGA GGATTCTCCT AAAATAAATG 600 

CACTTTAGAT GGGTAGGGTG GGGTGAGGCC TCTCCTAAAA TGAAACTCGT TTAATGTTTC 660 

TAAAAATAGT TTTCACTGGT GATCCTTAGT TACTGGCATG TAAAAATGAT GATTTCTACT 720 

GTCTCTCATA TGGACGGTTA TAAAAAATAC CATTATATTG AAAATAGGTC TCTGCTGCTA 780

CACTCGCCCT CATAGCAGAT CATGCATGCA CGCATCATTC GATCAGTTTT CGTTCTGATG 840 

CAGTTTTCGA TAAATGCCAA TTTTTTAACT GCATACGTTG CCCTTGCTCA GCACCAGCAC 900 

AGCAGTGTCG TGTCGTCCAT GCATGCACTT TAGGTGCAGT GCAGGGCCTC AACTCGGCCA 960 

CGTAGTTAGC GCCACTGCTA CAGATCGAGG CACCGGTCAG CCGGCCACGC ACGTCGACCG 1020 

CGCGCGTGCA TTTAAATACG CCGACGACGG AGCTTGATCG ACGAGAGAGC GAGCGCGATG 1080 

GGGAGGAGGG CGTGTTGCGC GAAGGAAGGC GTTAAGAGAG GGGCGTGGAC GAGCAAGGAG 1140 

GACGATGCCT TGGCCGCCTA CGTCAAGGCC CATGGCGAAG GCAAATGGAG GGAAGTGCCC 1200

CAGAAAGCCG GTAAAACTAG CTAGTCTTTT TATTTCATTT TGGGATCATA TATATACCCC 1260

CGAGGCAAGA CCGGAGGACG ATCACGTGTG TGGGTGCAGG TTTGCGTCGG TGCGGCAAGA 1320 

GCTGCCGGCT GCGGTGGCTG AACTACCTCC GGCCCAACAT CAGGCGCGGC AACATCTCCT 1380 

ACGACGAGGA GGATCTCATC ATCCGCCTCC ACAGGCTCCT CGGCAACAGG TCTGTGCAGT 1440 

GGCCAGTGGT GGGCTAGCTT ATTACACGAG CTGACGACGA GGCGATCGAT CGAGCGTCTG 1500 

CTGCGAATTC ATCTGTTCCG GTGTCGGCCG TGTGAGAGTG AGCTCATTCA TATGTACATG 1560

CGTGTTGGCG CGCAGGTGGT CGCTGATTGC AGGCAGGCTG CCTGGCCGAA CAGACAATGA 1620

AATCAAGAAC TACTGGAACA GCACGCTGGG CCGGAGGGCA GGCGCCGGCG CCGGCGCCGG 1680

CGGCAGCTGG GTCGTCGTCG CGCCGGACAC CGGCTCGCAC GCCACCCCGG CCGCGACGTC 1740

GGGCGCCTGC GAGACCGGCC AGAATAGCGC CGCTCATCGC GCGGACCCCG ACTCAGCCGG 1800

GACGACGACG ACCTCGGCGG CGGCGGTGTG GGCGCCCAAG GCCGTGCGGT GCACGGGCGG 1860

ACTCTTCTTC TTCCACCGGG ACACGACGCC GGCGCACGCG GGCGAGACGG CGACGCCAAT 1920

GGCCGGTGGA GGTGGAGGAG GAGGAGGAGA AGGAGGGTCG TCGGACGACT GCAGCTCGGC 1980

GGCGTCGGTA TCGCTTCGCG TCGGAAGCCA CGACGAGCCG TGCTTCTCCG GCGACGGTGA 2040

CGGCGACTGG ATGGACGACG TGAGGGCCCT GGCGTCGTTT CTCGAGTCCG ACGAGGACTG 2100

GCTCCGCTGT CAGACGGCCG GGCAGCTTGC GTAGACAACA AGTACACGTA TAGATGTCCA 2160

ATAAGCACGA GGCCCGCGAG CCCGGCACGA AGCCCGCTTT TTGGGCCCGG TCCGAGCCCG 2220

GCACGGCCCG GTTATATGCA GACCCGGGCC GGCGCGGCAC GAATAAGCGG GCCGGGCTCG 2280

GACAGGAAAT TAGGCACGGT GAGCTAGCCC GGCACGGCCC GTTTAGGTCT AAGCCCGTTA 2340

AGCCCGTTTT TTTACACTAA AACGTGCTTC TCGGCCCGCA TAGCCCGCTT CTCGGCCCGC 2400

TTTTTTCGTG CTAAACGGGC CGGCCCGGCC CGGTTTAGGC CCGTTGCGGG CCGGGCTCGG 2460

ACAGGAAATT GAGCCCGCGT GCTTAGCCGG CCCGGCCCGG TTTTTTAATC GTGCCTGGCG 2520

GGCCAGGCCC AAAACGGGCC GGGCTTCACC GGGCCCGGGC CGGACCGGGC CGGGCGGCCG 2580

GTTTGGACAT CTCTAAGTAC ACGTATGGAG GAGAATATAT ATATAGTCAT GCGTACGTAT 2640

AGATTTTTTC ATCCGATCCC AACAGAAATA CGTATGAAAA TGCTCTTCGT TCTTTTTCAT 2700

TTATCATATC TATACTATAC TTAAAACACC AGTTTCAACG GTCGTCATGC GTCATTTTTT 2760

TACAAATAAC CCCTCACAGC TATTTCAAAT TAATCCGCTG CACGTCTATA GATGCCAAAC 2820

GACGCCCAAC ACGGGCTAGA TGCACGCGGG CCACAACTAT GGCACAGGCA CGTCATGCCG 2880

GCCTGCTAAC TGTGTCGGGC TAGCCCGTTA GCCCGTCGAT CCATTTAATT AAATTAGCGT 2940

AACGACGCCC GACACGGGCT AGATGCACGT GGGCCACAAC TATGGCACAT GCACGTCATG 3000

CCGGCCTGTT AACTGTGTCG GGCCAGTCTG TTAGCCCATT GATCCATTTA ATTAAATCAG 3060

CGTAAAATGT TAAAAACGGT GCAGGAGGTG GGGTTCGAAC CCATACCCTG ATGGAAGAAG 3120

GGCGGGAGAC ACTGGGTGAA ACTGTCTAAC CAGTAGAATA TCTATCACGC TAAGATGTTT 3180

TTAATATTGA ATATAAATTG TATATAAGCA TATAAGTTTT TTTGTAAAAT AAAAAATAAT 3240

CGTGTCGGGC CGGGCCATCA CTACTGGCCG AGGCTACAAC CCAAGCACGA CACGACGTTC 3300

TTGGCTCTTG CAAGCATTAG GTCGTTTCTG AGACCATATT GGCGCAATGG ACTACATGAT 3360

GTTTGGGGTT GCTGAATTGA ATGGAGCAGC AATAATTTGT CACACTAACA GCAAAATGAA 3420

AGGTTATTTG TTGGTTTTAA ACGTTAGTAA TTGCTACGAA GTAGCATAAT TTATATGGAG 3480

CGCATCCAGT TTTTATTGAT GCCTGACTTT AGCAAl tJAC i ATrTATCTTT 3540

TTTATAAGTT TGACTTCATG GGACTTATTT TAGAACTTGA TTTPTPTTAT 3600

TTTGTCTCTA TATGATGAAA TTGTGTCATT TTATAATCTT lull CAl 1 3660

GTGAACTCTC TTCTAATCAC TCACTTCATT AGl 1 (j I u 1 i tj •p a p p 21 2x r* A TATTTGCATA 3720

GAGTAAACAA TAACATCAGT TAGCCAAATC AAAAAATATA TTATACAGAG AGCGGAGACA 3780

ATCAAATAAA AAATCTTGAA ATTTTTTTAA TGGATAGTTT ACGTGGGTAT TGTTGTAAGC 3840

CGTCGCAACG CACGGGCAAC CGACTAGTTT TAGTTTATAA ATTAATAAAC GTACGACAAA 3900

TATTAAGAAC GCCACCTTTC CATGCCTACG CGCGCGTGAG ACACGACCGG GGCACGTCAG 3960

ACGTGTGCCC CTGTTGTATA ATTTATTTAC TTTTTAATGA CTATGTGCTG TTGGTTGCCG 4020

TTGGCTTCAT CGTGTTCGTA GCCATGCATA AATCCAGCG 4059



(2) INFORMATION FOR SEQ ID NO: 2: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4896 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: circular 



(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: plasmid pTS256, linearized at HindIII

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: complement (39..317)
(D) OTHER INFORMATION: /label= 3'nos
/note= "3' regulatory sequence containing the polyadenylation
signal of the nopaline synthase gene of Agrobacterium T-DNA"

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: complement (318..869)
(D) OTHER INFORMATION: /label= bar
/note= "coding region of bar gene of Streptomyces hygroscopicus"

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: complement (870..1702)
(D) OTHER INFORMATION: /label= P35S
/note= "35S promoter of Cauliflower Mosaic Virus"

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 1740..2284
(D) OTHER INFORMATION: /label= PTA29
/note= "promoter of TA29 gene of Nicotiana tabacum"

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 2285..2557
(D) OTHER INFORMATION: /label= barstar
/note= "coding region of barstar gene of Bacillus amyloliquefaciens"

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 2558..2879
(D) OTHER INFORMATION: /label= 3'nos
/note= "3' regulatory sequence containing the polyadenylation
signal of the nopaline synthase gene of Agrobacterium T-DNA"

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 1..38
(D) OTHER INFORMATION: /label= pUC19
/note= "pUC19 derived sequence"

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 2880..4896
(D) OTHER INFORMATION: /label= pUC19
/note= "pUC19 derived sequence"

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 3004..3009
(D) OTHER INFORMATION: /label= EcoRI



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 



AGCTTGCATG CCTGCAGGTC GACTCTAGAG GATCTTCCCG ATCTAGTAAC ATAGATGACA 60

CCGCGCGCGA TAATTTATCC TAGTTTGCGC GCTATATTTT GTTTTCTATC GCGTATTAAA 120

TGTATAATTG CGGGACTCTA ATCATAAAAA CCCATCTCAT AAATAACGTC ATGCATTACA 180

TGTTAATTAT TACATGCTTA ACGTAATTCA ACAGAAATTA TATGATAATC ATCGCAAGAC 240

CGGCAACAGG ATTCAATCTT AAGAAACTTT ATTGCCAAAT GTTTGAACGA TCTGCTTCGG 300

ATCCTAGACG CGTGAGATCA GATCTCGGTG ACGGGCAGGA CCGGACGGGG CGGTACCGGC 360

AGGCTGAAGT CCAGCTGCCA GAAACCCACG TCATGCCAGT TCCCGTGCTT GAAGCCGGCC 420

GCCCGCAGCA TGCCGCGGGG GGCATATCCG AGCGCCTCGT GCATGCGCAC GCTCGGGTCG 480

TTGGGCAGCC CGATGACAGC GACCACGCTC TTGAAGCCCT GTGCCTCCAG GGACTTCAGC 540

AGGTGGGTGT AGAGCGTGGA GCCCAGTCCC GTCCGCTGGT GGCGGGGGGA GACGTACACG 600

GTCGACTCGG CCGTCCAGTC GTAGGCGTTG CGTGCCTTCC AGGGGCCCGC GTAGGCGATG 660

CCGGCGACCT CGCCGTCCAC CTCGGCGACG AGCCAGGGAT AGCGCTCCCG CAGACGGACG 720

AGGTCGTCCG TCCACTCCTG CGGTTCCTGC GGCTCGGTAC GGAAGTTGAC CGTGCTTGTC 780

TCGATGTAGT GGTTGACGAT GGTGCAGACC GCCGGCATGT CCGCCTCGGT GGCACGGCGG 840

ATGTCGGCCG GGCGTCGTTC TGGGTCCATG GTTATAGAGA GAGAGATAGA TTTATAGAGA 900

GAGACTGGTG ATTTCAGCGT GTCCTCTCCA AATGAAATGA ACTTCCTTAT ATAGAGGAAG 960

GGTCTTGCGA AGGATAGTGG GATTGTGCGT CATCCCTTAC GTCAGTGGAG ATGTCACATC 1020 

AATCCACTTG CTTTGAAGAC GTGGTTGGAA CGTCTTCTTT TTCCACGATG CTCCTCGTGG 1080 

GTGGGGGTCC ATCTTTGGGA CCACTGTCGG CAGAGGCATC TTGAATGATA GCCTTTCCTT 1140 

TATCGCAATG ATGGCATTTG TAGGAGCCAC CTTCCTTTTC TACTGTCCTT TCGATGAAGT 1200 

GACAGATAGC TGGGCAATGG AATCCGAGGA GGTTTCCCGA AATTATCCTT TGTTGAAAAG 1260

TCTCAATAGC CCTTTGGTCT TCTGAGAGTG TATCTTTGAC ATTTTTGGAG TAGACCAGAG 1320

TGTCGTGCTC CACCATGTTG ACGAAGATTT TCTTCTTGTC ATTGAGTCGT AAAAGACTCT 1380

GTATGAACTG TTCGCCAGTC TTCACGGCGA GTTCTGTTAG ATCCTCGATT TGAATCTTAG 1440

ACTCCATGCA TGGCCTTAGA TTCAGTAGGA ACTACCTTTT TAGAGACTCC AATCTCTATT 1500

ACTTGCCTTG GTTTATGAAG CAAGCCTTGA ATCGTCCATA CTGGAATAGT ACTTCTGATC 1560 

TTGAGAAATA TGTCTTTCTC TGTGTTCTTG ATGCAATTAG TCCTGAATCT TTTGACTGCA 1620 

TCTTTAACCT TCTTGGGAAG GTATTTGATC TCCTGGAGAT TGTTACTCGG GTAGATCGTC 1680 

TTGATGAGAC CTGCTGCGTA GGAGCTTGCA TGCCTGCAGG TCGACTCTAG AGGATCCCCA 1740

TCTAGCTAAG TATAACTGGA TAATTTGCAT TAACAGATTG AATATAGTGC CAAACAAGAA 1800 

GGGACAATTG ACTTGTCACT TTATGAAAGA TGATTCAAAC ATGATTTTTT ATGTACTAAT 1860 

ATATACATCC TACTCGAATT AAAGCGACAT AGGCTCGAAG TATGCACATT TAGCAATGTA 1920 

AATTAAATCA GTTTTTGAAT CAAGCTAAAA GCAGACTTGC ATAAGGTGGG TGGCTGGACT 1980 

AGAATAAACA TCTTCTCTAG CACAGCTTCA TAATGTAATT TCCATAACTG AAATCAGGGT 2040 

GAGACAAAAT TTTGGTACTT TTTCCTCACA CTAAGTCCAT GTTTGCAACA AATTAATACA 2100

TGAAACCTTA ATGTTACCGT CAGATTAGCC TGCTACTCCC CATTTTCCTC GAAATGCTCC 2160 

AACAAAAGTT AGTTTTGCAA GTTGTTGTGT ATGTCTTGTG CTCTATATAT GCCCTTGTGG 2220 

TGCAAGTGTA ACAGTACAAC ATCATCACTC AAATC7WVGT TTTTACTTAA AGAAATTAGC 2280 

TACCATGAAA AAAGCAGTCA TTAACGGGGA ACAAATCAGA AGTATCAGCG ACCTCCACCA 2340 

GACATTGAAA AAGGAGCTTG CCCTTCCGGA ATACTACGGT GAAAACCTGG ACGCTTTATG 2400 

GGATTGTCTG ACCGGATGGG TGGAGTACCC GCTCGTTTTG GAATGGAGGC AGTTTGAACA 2460

AAGCAAGCAG CTGACTGAAA ATGGCGCCGA GAGTGTGCTT CAGGTTTTCC GTGAAGCGAA 2520 

AGCGGAAGGC TGCGACATCA CCATCATACT TTCTTAATAC GATCAATGGG AGATGAACAA 2580 

TATGGAAACA CAAACCCGCA AGCTTGGTCT AGAGGATCCG AAGCAGATCG TTCAAACATT 2640 

TGGCAATAAA GTTTCTTAAG ATTGAATCCT GTTGCCGGTC TTGCGATGAT TATCATATAA 2700 

TTTCTGTTGA ATTACGTTAA GCATGTAATA ATTAACATGT AATGCATGAC GTTATTTATG 2760 
AGATGGGTTT TTATGATTAG AGTCCCGCAA TTATACATTT AATACGCGAT AGAAAACAAA 2820

ATATAGCGCG CAAACTAGGA TAAATTATCG CGCGCGGTGT CATCTATGTT ACTAGATCGG 2880

GAAGATCCCC GGGTACCGAG CTCGAATTCT GATCAGGCCA ACGCGCGGGG AGAGGCGGTT 2940

TGCGTATTGG GCGCTCTTCC GCTTCCTCGC TCACTGACTC GCTGCGCTCG GTCGTTCGGC 3000

TGCGGCGAGC GGTATCAGCT CACTCAAAGG CGGTAATACG GTTATGCACA GAATCAGGGG 3060

ATAACGCAGG AAAGAAGATG TGAGCAAAAG GCCAGCAAAA GGCCAGGAAC CGTAAAAAGG 3120

CCGCGTTGCT GGCGTTTTTC CATAGGCTCC GCCCCCCTGA CGAGCATCAC AAAAATCGAC 3180

GCTCAAGTCA GAGGTGGCGA AACCCGACAG GACTATAAAG ATACCAGGCG TTTCCCCCTG 3240

GAAGCTCCCT CGTGCGCTCT CCTGTTCCGA CCCTGCCGCT TACCGGATAC CTGTCCGCCT 3300

TTCTCCCTTC GGGAAGCGTG GCGCTTTCTC AATGCTCACG CTGTAGGTAT CTCAGTTCGG 3360

TGTAGGTCGT TCGCTCCAAG CTGGGCTGTG TGCACGAACC CCCCGTTCAG CCCGACCGCT 3420

GCGCCTTATC CGGTAACTAT CGTCTTGAGT CCAACCCGGT AAGACACGAC TTATCGCCAC 3480

TGGCAGCAGC CACTGGTAAC AGGATTAGCA GAGCGAGGTA TGTAGGCGGT GCTACAGAGT 3540

TCTTGAAGTG GTGGCCTAAC TACGGCTACA CTAGAAGGAC AGTATTTGGT ATCTGCGCTC 3600

TGCTGAAGCC AGTTACCTTC GGAAAAAGAG TTGGTAGCTC TTGATCCGGC AAACAAACCA 3660

CCGCTGGTAG CGGTGGTTTT TTTGTTTGCA AGCAGCAGAT TACGCGCAGA AAAAAAGGAT 3720

CTCAAGAAGA TCCTTTGATC TTTTCTACGG GGTCTGACGC TCAGTGGAAC GAAAACTCAC 3780

GTTAAGGGAT TTTGGTCATG AGACTCGAGC CAAAAAGGAT CTTCACCTAG ATCCTTTTAA 3840

ATTAAAAATG AAGTTTTAAA TCAATCTAAA GTATATATGA GTAAACTTGG TCTGACAGTT 3900

ACCAATGCTT AATCAGTGAG GCACCTATCT CAGCGATCTG TCTATTTCGT TCATCCATAG 3960

TTGCCTGACT CCCCGTCGTG TAGATAACTA CGATACGGGA GGGCTTACCA TCTGGCCCCA 4020

GTGCTGCAAT GATACCGCGA GACCCACGCT CACCGGCTCC AGATTTATCA GCAATAAACC 4080

AGCCAGCCGG AAGGGCCGAG CGCAGAAGTG GTCCTGCAAC TTTATCCGCC TCCATCCAGT 4140

CTATTAATTG TTGCCGGGAA GCTAGAGTAA GTAGTTCGCC AGTTAATAGT TTGCGCAACG 4200

TTGTTGCCAT TGCTACAGGC ATCGTGGTGT CACGCTCGTC GTTTGGTATG GCTTCATTCA 4260

GCTCCGGTTC CCAACGATCA AGGCGAGTTA CATGATCCCC CATGTTGTGC AAAAAAGCGG 4320

TTAGCTCCTT CGGTCCTCCG ATCGTTGTCA GAAGTAAGTT GGCCGCAGTG TTATCACTCA 4380

TGGTTATGGC AGCACTGCAT AATTCTCTTA CTGTCATGCC ATCCGTAAGA TGCTTTTCTG 4440

TGACTGGTGA GTACTCAACC AAGTCATTCT GAGAATAGTG TATGCGGCGA CCGAGTTGCT 4500

CTTGCCCGGC GTCAATACGG GATAATACCG CGCCACATAG CAGAACTTTA AAAGTGCTCA 4560

TCATTGGAAA ACGTTCTTCG GGGCGAAAAC TCTCAAGGAT CTTACCGCTG TTGAGATCCA 4620

GTTCGATGTA ACCCACTCGT GCACCCAACT GATCTTCAGC ATCTTTTACT TTCACCAGCG 4680

TTTCTGGGTG AGCAAAAACA GGAAGGCAAA ATGCCGCAAA AAAGGGAATA AGGGCGACAC 4740

GGAAATGTTG AATACTCATA CTCTTCCTTT TTCAATATTA TTGAAGCATT TATCAGGGTT 4800 

ATTGTCTCAT GAGCGGATAC ATATTTGAAT GTATTTAGAA AAATAAACAA ATAGGGGTTC 4860 

CGCGCACATT TCCCCGAAAA GTGCCACCTG ACGTCA 4896 
(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3544 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: EcoRI-HindIII region of plasmid pTS200

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 3227..3504

(D) OTHER INFORMATION: /label= 3'nos

/note= "3' regulatory sequence containing the polyadenylation
signal of the nopaline synthase gene of Agrobacterium T-DNA"

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 2675..3226

(D) OTHER INFORMATION: /label= bar

/note= "coding region of bar gene of Streptomyces hygroscopicus"

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1841..2674

(D) OTHER INFORMATION: /label= P35S

/note= "35S promoter of Cauliflower Mosaic Virus"

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: complement (626..1803)
(D) OTHER INFORMATION: /label= PCA55

/note= "promoter of CA55 gene of Zea mays"

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: complement (353..625)
(D) OTHER INFORMATION: /label= barstar

/note= "coding region of barstar gene of Bacillus
amyloliquefaciens"

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: complement (30..352)
(D) OTHER INFORMATION: /label= 3'nos

/note= "3' regulatory sequence containing the polyadenylation
signal of the nopaline synthase gene of Agrobacterium T-DNA"

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..6

(D) OTHER INFORMATION: /label= EcoRI

(ix) FEATURE:

(A) NAME/KEY: -
(B) LOCATION: 3539..3544
(D) OTHER INFORMATION: /label= HindIII

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

GAATTCGAGC TCGGTACCCG GGGATCTTCC CGATCTAGTA ACATAGATGA CACCGCGCGC 60 

GATAATTTAT CCTAGTTTGC GCGCTATATT TTGTTTTCTA TCGCGTATTA AATGTATAAT 120 

TGCGGGACTC TAATCATAAA AACCCATCTC ATAAATAACG TCATGCATTA CATGTTAATT 180

ATTACATGCT TAACGTAATT CAACAGAAAT TATATGATAA TCATCGCAAG ACCGGCAACA 240 

GGATTCAATC TTAAGAAACT TTATTGCCAA ATGTTTGAAC GATCTGCTTC GGATCCTCTA 300

GACCAAGCTT GCGGGTTTGT GTTTCCATAT TGTTCATCTC CCATTGATCG TATTAAGAAA 360 

GTATGATGGT GATGTCGCAG CCTTCCGCTT TCGCTTCACG GAAAACCTGA AGCACACTCT 420 

CGGCGCCATT TTCAGTCAGC TGCTTGCTTT GTTCAAACTG CCTCCATTCC AAAACGAGCG 480 

GGTACTCCAC CCATCCGGTC AGAGAATCCG ATAAAGCGTC CAGGTTTTCA CCGTAGTATT 540 

CCGGAAGGGC AAGCTCCTTT TTCAATGTCT GGTGGAGGTC GCTGATACTT CTGATTTGTT 600

CCCCGTTAAT GACTGCTTTT TTCATGGCTG CAGCTAGTTA GCTCGATGTA TCTTCTGTAT 660 

ATGCAGTGCA GCTTCTGCGT TTTGGCTGCT TTGAGCTGTG AAATCTCGCT TTCCAGTCCC 720 

TGCGTGTTTT ATAGTGCTGT ACGTTCGTGA TCGTGAGCAA ACAGGGCGTG CCTCAACTAC 780 

TGGTTTGGTT GGGTGACAGG CGCCAACTAC GTGCTCGTAA CCGATCGAGT GAGCGTAATG 840

CAACATTTTT TCTTCTTCTC TCGCATTGGT TTCATCCAGC CAGGAGACCC GAATCGAATT 900 

GAAATCACAA ATCTGAGGTA CAGTATTTTT ACAGTACCGT TCGTTCGAAG GTCTTCGACA 960 

GGTCAAGGTA ACAAAATCAG TTTTAAATTG TTGTTTCAGA TCAAAGAAAA TTGAGATGAT 1020 

CTGAAGGACT TGGACCTTCG TCCAATGAAA CACTTGGACT AATTAGAGGT GAATTGAAAG 1080 

CAAGCAGATG CAACCGAAGG TGGTGAAAGT GGAGTTTCAG CATTGACGAC GAAAACCTTC 1140 

GAACGGTATA AAAAAGAAGC CGCAATTAAA CGAAGATTTG CCAAAAAGAT GCATCAACCA 1200 

AGGGAAGACG TGCATACATG TTTGATGAAA ACTCGTAAAA ACTGAAGTAC GATTCCCCAT 1260 

TCCCCTCCTT TTCTCGTTTC TTTTAACTGA AGCAAAGAAT TTGTATGTAT TCCCTCCATT 1320 

CCATATTCTA GGAGGTTTTG GCTTTTCATA CCCTCCTCCA TTTCAAATTA TTTGTCATAC 1380 

ATTGAAGATA TACACCATTC TAATTTATAC TAAATTACAG CTTTTAGATA CATATATTTT 1440

ATTATACACT TAGATACGTA TTATATAAAA CACCTAATTT AAAATAAAAA ATTATATAAA 1500 

AAGTGTATCT AAAAAATCAA AATACGACAT AATTTGAAAC GGAGGGGTAC TACTTATGCA 1560 
AACCAATCGT GGTAACCCTA AACCCTATAT GAATGAGGCC ATGATTGTAA TGCACCGTCT 1620

GATTAACCAA GATATCAATG GTCAAAGATA TACATGATAC ATCCAAGTCA CAGCGAAGGC 1680 

AAATGTGACA ACAGTTTTTT TTACCAGAGG GACAAGGGAG AATATCTATT CAGATGTCAA 1740 

GTTCCCGTAT CACACTGCCA GGTCCTTACT CCAGACCATC TTCCGGCTCT ATTGATGCAT 1800 

ACCAGGAATT GATCTAGAGT CGACCTGCAG GCATGCAAGC TCCTACGCAG CAGGTCTCAT 1860 

CAAGACGATC TACCCGAGTA ACAATCTCCA GGAGATCAAA TACCTTCCCA AGAAGGTTAA 1920 

AGATGCAGTC AAAAGATTCA GGACTAATTG CATCAAGAAC ACAGAGAAAG ACATATTTCT 1980 

CAAGATCAGA AGTACTATTC CAGTATGGAC GATTCAAGGC TTGCTTCATA AACCAAGGCA 2040 

AGTAATAGAG ATTGGAGTCT CTAAAAAGGT AGTTCCTACT GAATCTAAGG CCATGCATGG 2100 

AGTCTAAGAT TCAAATCGAG GATCTAACAG AACTCGCCGT GAAGACTGGC GAACAGTTCA 2160 

TACAGAGTCT TTTACGACTC AATGACAAGA AGAAAATCTT CGTCAACATG GTGGAGCACG 2220 

ACACTCTGGT CTACTCCAAA AATGTCAAAG ATACAGTCTC AGAAGACCAA AGGGCTATTG 2280 

AGACTTTTCA ACAAAGGATA ATTTCGGGAA ACCTCCTCGG ATTCCATTGC CCAGCTATCT 2340 

GTCACTTCAT CGAAAGGACA GTAGAAAAGG AAGGTGGCTC CTACAAATGC CATCATTGCG 2400 

ATAAAGGAAA GGCTATCATT CAAGATGCCT CTGCCGACAG TGGTCCCAAA GATGGACCCC 2460 

CACCCACGAG GAGCATCGTG GAAAAAGAAG ACGTTCCAAC CACGTCTTCA AAGCAAGTGG 2520 

ATTGATGTGA CATCTCCACT GACGTAAGGG ATGACGCACA ATCCCACTAT CCTTCGCAAG 2580 

ACCCTTCCTC TATATAAGGA AGTTCATTTC ATTTGGAGAG GACACGCTGA AATCACCAGT 2640 

CTCTCTCTAT AAATCTATCT CTCTCTCTAT AACCATGGAC CCAGAACGAC GCCCGGCCGA 2700 

CATCCGCCGT GCCACCGAGG CGGACATGCC GGCGGTCTGC ACCATCGTCA ACCACTACAT 2760 

CGAGACAAGC ACGGTCAACT TCCGTACCGA GCCGCAGGAA CCGCAGGAGT GGACGGACGA 2820 

CCTCGTCCGT CTGCGGGAGC GCTATCCCTG GCTCGTCGCC GAGGTGGACG GCGAGGTCGC 2880 

CGGCATCGCC TACGCGGGCC CCTGGAAGGC ACGCAACGCC TACGACTGGA CGGCCGAGTC 2940 

GACCGTGTAC GTCTCCCCCC GCCACCAGCG GACGGGACTG GGCTCCACGC TCTACACCCA 3000 

CCTGCTGAAG TCCCTGGAGG CACAGGGCTT CAAGAGCGTG GTCGCTGTCA TCGGGCTGCC 3060 

CAACGACCCG AGCGTGCGCA TGCACGAGGC GCTCGGATAT GCCCCCCGCG GCATGCTGCG 3120 

GGCGGCCGGC TTCAAGCACG GGAACTGGCA TGACGTGGGT TTCTGGCAGC TGGACTTCAG 3180 

CCTGCCGGTA CCGCCCCGTC CGGTCCTGCC CGTCACCGAG ATCTGATCTC ACGCGTCTAG 3240 

GATCCGAAGC AGATCGTTCA AACATTTGGC AATAAAGTTT CTTAAGATTG AATCCTGTTG 3300 

CCGGTCTTGC GATGATTATC ATATAATTTC TGTTGAATTA CGTTAAGCAT GTAATAATTA 3360 

ACATGTAATG CATGACGTTA TTTATGAGAT GGGTTTTTAT GATTAGAGTC CCGCAATTAT 3420 
ACATTTAATA CGCGATAGAA AACAAAATAT AGCGCGCAAA CTAGGATAAA TTATCGCGCG 3480

CGGTGTCATC TATGTTACTA GATCGGGAAG ATCCTCTAGA GTCGACCTGC AGGCATGCAA 3540 
GCTT 3544 
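
As a reading aid for the feature tables: the LOCATION fields are 1-based and inclusive, and a complement (...) location refers to the opposite strand. The following is a minimal Python sketch, assuming rows laid out as in this listing (ten-base blocks followed by a running position count). The helper names `parse_rows`, `feature` and `reverse_complement` are hypothetical and are not part of the listing.

```python
def parse_rows(rows):
    """Join listing rows into one uppercase string, dropping the position count."""
    parts = []
    for row in rows:
        blocks = row.split()
        # The last token of a complete row is the running base count, not sequence.
        if blocks and blocks[-1].isdigit():
            blocks = blocks[:-1]
        parts.append("".join(blocks))
    return "".join(parts).upper()


def feature(seq, start, end):
    """Return the 1-based, inclusive span start..end used in LOCATION fields."""
    return seq[start - 1:end]


_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the strand read by a complement (start..end) location."""
    return seq.translate(_COMPLEMENT)[::-1]


# First row of SEQ ID NO: 3, copied from the listing above.
first_row = ["GAATTCGAGC TCGGTACCCG GGGATCTTCC CGATCTAGTA ACATAGATGA CACCGCGCGC 60"]
seq = parse_rows(first_row)

print(feature(seq, 1, 6))  # the span annotated as EcoRI (1..6)
```

Applied to the first row of SEQ ID NO: 3, the span 1..6 reads GAATTC, which agrees with the EcoRI label in the feature table.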



(2) INFORMATION FOR SEQ ID NO: 4: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: oligonucleotide 1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
CGTTTCTCGA ATCCGACGAG G 21 
(2) INFORMATION FOR SEQ ID NO: 5: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4824 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: circular 



(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: plasmid pCOL9

(ix) FEATURE: 

(A) NAME/KEY: - 

(B) LOCATION: 396. .401 

(D) OTHER INFORMATION :/label= EcoRI 

(ix) FEATURE: 

(A) NAME/KEY: - 

(B) LOCATION: 2367..2379

(D) OTHER INFORMATION: /label= SfiI



(ix) FEATURE: 

(A) NAME/KEY: - 

(B) LOCATION: 884. .888 

(D) OTHER INFORMATION: /label= C1-S

/note= "TGCAG (in C1) which in the C1-S allele is replaced with
TTAGG"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

TCGCGCGTTT CGGTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG GAGACGGTCA 60 

CAGCTTGTCT GTAAGCGGAT GCCGGGAGCA GACAAGCCCG TCAGGGCGCG TCAGCGGGTG 120 

TTGGCGGGTG TCGGGGCTGG CTTAACTATG CGGCATCAGA GCAGATTGTA CTGAGAGTGC 180 
ACCATATGCG GTGTGAAATA CCGCACAGAT GCGTAAGGAG AAAATACCGC ATCAGGCGCC 240

ATTCGCCATT CAGGCTGCGC AACTGTTGGG AAGGGCGATC GGTGCGGGCC TCTTCGCTAT 300

TACGCCAGCT GGCGAAAGGG GGATGTGCTG CAAGGCGATT AAGTTGGGTA ACGCCAGGGT 360

TTTCCCAGTC ACGACGTTGT AAAACGACGG CCAGTGAATT CGCTTACGGT CTCAAACAAG 420

CAATTTACAC TCAGTTGGTT GTAATATGTG GACAATAAAA CTACAAACTA GACACAAATC 480

ATACCATAGA CGGAGTGGTA GCAGAGGGTA CGCGCGAGGG TGAGATAGAG GATTCTCCTA 540

AAATAAATGC ACTTTAGATG GGTAGGGTGG GGTGAGGCCT CTCCTAAAAT GAAACTCGTT 600

TAATGTTTCT AAAAATAGTT TTCACTGGTG ATCCTTAGTT ACTGGCATGT AAAAATGATG 660

ATTTCTACTG TCTCTCATAT GGACGGTTAT AAAAAATACC ATTATATTGA AAATAGGTCT 720

CTGCTGCTAC ACTCGCCCTC ATAGCAGATC ATGCATGCAC GCATCATTCG ATCAGTTTTC 780

GTTCTGATGC AGTTTTCGAT AAATGCCAAT TTTTTAACTG CATACGTTGC CCTTGCTCAG 840

CACCAGCACA GCAGTGTCGT GTCGTCCATG CATGCACTTT AGGTGCAGTG CAGGGCCTCA 900

ACTCGGCCAC GTAGTTAGCG CCACTGCTAC AGATCGAGGC ACCGGTCAGC CGGCCACGCA 960

CGTCGACCGC GCGCGTGCAT TTAAATACGC CGACGACGGA GCTTGATCGA CGAGAGAGCG 1020

AGCGCGATGG GGAGGAGGGC GTGTTGCGCG AAGGAAGGCG TTAAGAGAGG GGCGTGGACG 1080

AGCAAGGAGG ACGATGCCTT GGCCGCCTAC GTCAAGGCCC ATGGCGAAGG CAAATGGAGG 1140

GAAGTGCCCC AGAAAGCCGG TAAAACTAGC TAGTCTTTTT ATTTCATTTT GGGATCATAT 1200

ATATACCCCC GAGGCAAGAC CGGAGGACGA TGACGTGTGT GGGTGCAGGT TTGCGTCGGT 1260

GCGGCAAGAG CTGCCGGCTG CGGTGGCTGA ACTACCTCCG GCCCAACATC AGGCGCGGCA 1320

ACATCTCCTA CGACGAGGAG GATCTCATCA TCCGCCTCCA CAGGCTCCTC GGCAACAGGT 1380

CTGTGCAGTG GCCAGTGGTG GGCTAGCTTA TTACACGAGC TGACGACGAG GCGATCGATC 1440

GAGCGTCTGC TGCGAATTCA TCTGTTCCGG TGTCGGCCGT GTGAGAGTGA GCTCATTCAT 1500

ATGTACATGC GTGTTGGCGC GCAGGTGGTC GCTGATTGCA GGCAGGCTGC CTGGCCGAAC 1560

AGACAATGAA ATCAAGAACT ACTGGAACAG CACGCTGGGC CGGAGGGCAG GCGCCGGCGC 1620

CGGCGCCGGC GGCAGCTGGG TCGTCGTCGC GCCGGACACC GGCTCGCACG CCACCCCGGC 1680

CGCGACGTCG GGCGCCTGCG AGACCGGCCA GAATAGCGCC GCTCATCGCG CGGACCCCGA 1740

CTCAGCCGGG ACGACGACGA CCTCGGCGGC GGCGGTGTGG GCGCCCAAGG CCGTGCGGTG 1800

CACGGGCGGA CTCTTCTTCT TCCACCGGGA CACGACGCCG GCGCACGCGG GCGAGACGGC 1860

GACGCCAATG GCCGGTGGAG GTGGAGGAGG AGGAGGAGAA GCAGGGTCGT CGGACGACTG 1920

CAGCTCGGCG GCGTGGGTAT CGCTTCGCGT CGGAAGCCAC GACGAGCCGT GCTTCTCCGG 1980

CGACGGTGAC GGCGACTGGA TGGACGACGT GAGGGCCCTG GCGTCGTTTC TCGAGTCCGA 2040

CGAGGACTGG CTCCGCTGTC AGACGGCCGG GCAGCTTGCG TAGACAACAA GTACACGTAT 2100
TAAGPACGAG GCCCGCGAGC CCGGCACGAA GCCCGCTTTT TGGGCCCGGT 2160

[illegible] CACGGCCCGG TTATATGCAG ACCCGGGCCG GCCCGGCACG AATAAGCGGG 2220

CCGGGCTCGG ACAGGAAATT AGGCACGGTG AGCTAGCCCG GCACGGCCCG TTTAGGTCTA 2280

[illegible] GCCCGTTTTT TTACACTAAA ACGTGCTTCT CGGCCCGCAT AGCCCGCTTC 2340

[illegible] TTTTTCGTGC TAAACGGGCC GGCCCGGCCC GGTTTAGGCC CGTTGCGGGC 2400

[illegible] CAGGAAATTG AGCCCGCGTG CTTAGCCGGC CCGGCCCGGT TTTTTAATCG 2460

[illegible] GCGAGGCCCA AAACGGGCCG GGCTTCACCG GGCCCGGGCC GGACCGGGCC 2520

[illegible] TTTGGACATC TCTAAGTACA CGTATGGAGG AGAATATATA TATAGTCATG 2580

[illegible] GGCGTAATCA TGGTCATAGC TGTTTCCTGT GTGAAATTGT TATCCGCTCA 2640

[illegible] CAACATACGA GCCGGAAGCA TAAAGTGTAA AGCCTGGGGT GCCTAATGAG 2700

[illegible] CACATTAATT GCGTTGCGCT CACTGCCCGC TTTCCAGTCG GGAAACCTGT 2760

[illegible] GCATTAATGA ATCGGCCAAC GCGCGGGGAG AGGCGGTTTG CGTATTGGGC 2820

[illegible] TTCGTCGCTC ACTGACTCGC TGCGCTCGGT CGTTCGGCTG CGGCGAGCGG 2880

[illegible] CTCAAAGGCG GTAATACGGT TATCCACAGA ATCAGGGGAT AACGCAGGAA 2940

[illegible] AGCAAAAGGC CAGCAAAAGG CCAGGAACCG TAAAAAGGCC GCGTTGCTGG 3000

CGTTTTTCCA TAGGCTCCGC CCCCCTGACG AGCATCACAA AAATCGACGC TCAAGTCAGA 3060

GGTGGCGAAA CCCGACAGGA CTATAAAGAT ACCAGGCGTT TCCCCCTGGA AGCTCCCTCG 3120

TGCGCTCTCC TGTTCCGACC CTGCCGCTTA CCGGATACCT GTCCGCCTTT CTCCCTTCGG 3180

GAAGCGTGGC GCTTTCTCAA TGCTCACGCT GTAGGTATCT CAGTTCGGTG TAGGTCGTTC 3240

[illegible] GGGCTGTGTG CACGAACCCC CCGTTCAGCC CGACCGCTGC GCCTTATCCG 3300

[illegible] TCTTGAGTCC AACCCGGTAA GACACGACTT ATCGCCACTG GCAGCAGCCA 3360

CTGGTAACAG GATTAGCAGA GCGAGGTATG TAGGCGGTGC TACAGAGTTC TTGAAGTGGT 3420

GGCCTAACTA CGGCTACACT AGAAGGACAG TATTTGGTAT CTGCGCTCTG CTGAAGCCAG 3480

TTACCTTCGG AAAAAGAGTT GGTAGCTCTT GATCCGGCAA ACAAACCACC GCTGGTAGCG 3540

GTGGTTTTTT TGTTTGCAAG CAGCAGATTA CGCGCAGAAA AAAAGGATCT CAAGAAGATC 3600

CTTTGATCTT TTCTACGGGG TCTGACGCTC AGTGGAACGA AAACTCACGT TAAGGGATTT 3660

TGGTCATGAG ATTATCAAAA AGGATCTTCA CCTAGATCCT TTTAAATTAA AAATGAAGTT 3720

TTAAATCAAT CTAAAGTATA TATGAGTAAA CTTGGTCTGA CAGTTACCAA TGCTTAATCA 3780

GTGAGGCACC TATCTCAGCG ATCTGTCTAT TTCGTTCATC CATAGTTGCC TGACTCCCCG 3840

TCGTGTAGAT AACTACGATA CGGGAGGGCT TACCATCTGG CCCCAGTGCT GCAATGATAC 3900

CGCGAGACCC ACGCTCACCG GCTCCAGATT TATCAGCAAT AAACCAGCGA GCCGGAAGGG 3960

CCGAGCGCAG AAGTGGTCCT GCAACTTTAT CCGCCTCCAT CCAGTCTATT AATTGTTGCC 4020

GGGAAGCTAG AGTAAGTAGT TCGCCAGTTA ATAGTTTGCG CAACGTTGTT GCCATTGCTA 4080

CAGGCATCGT GGTGTCACGC TCGTCGTTTG GTATGGCTTC ATTCAGCTCC GGTTCCCAAC 4140

GATCAAGGCG AGTTACATGA TCCCCCATGT TGTGCAAAAA AGCGGTTAGC TCCTTCGGTC 4200

CTCCGATCGT TGTCAGAAGT AAGTTGGCCG CAGTGTTATC ACTCATGGTT ATGGCAGCAC 4260

TGCATAATTC TCTTACTGTC ATGCCATCCG TAAGATGCTT TTCTGTGACT GGTGAGTACT 4320

CAACCAAGTC ATTCTGAGAA TAGTGTATGC GGCGACCGAG TTGCTCTTGC CCGGCGTCAA 4380

TACGGGATAA TACCGCGCCA CATAGCAGAA CTTTAAAAGT GCTCATCATT GGAAAACGTT 4440

CTTCGGGGCG AAAACTCTCA AGGATCTTAC CGCTGTTGAG ATCCAGTTCG ATGTAACCCA 4500

CTCGTGCAGC CAACTGATCT TCAGCATCTT TTACTTTCAC CAGCGTTTCT GGGTGAGCAA 4560

AAACAGGAAG GCAAAATGCC GCAAAAAAGG GAATAAGGGC GACACGGAAA TGTTGAATAC 4620

TCATACTCTT CCTTTTTCAA TATTATTGAA GCATTTATCA GGGTTATTGT CTCATGAGCG 4680

GATACATATT TGAATGTATT TAGAAAAATA AACAAATAGG GGTTCCGCGC ACATTTCCCC 4740

GAAAAGTGCC ACCTGACGTC TAAGAAACCA TTATTATCAT GACATTAACC TATAAAAATA 4800

GGCGTATCAC GAGGCCCTTT CGTC 4824

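The C1-S note in the feature table of SEQ ID NO: 5 describes a five-base replacement at positions 884..888 (TGCAG replaced with TTAGG). A minimal Python sketch of that substitution follows, using only the row of SEQ ID NO: 5 that ends at position 900. The function name `substitute` and its `offset` parameter are hypothetical, introduced here for illustration only.

```python
def substitute(seq, start, end, replacement, offset=1):
    """Replace the 1-based inclusive span start..end of seq.

    offset is the listing position of seq[0], so a single row can be edited
    without loading the whole sequence.
    """
    i = start - offset
    j = end - offset + 1
    if len(replacement) != j - i:
        # Equal length keeps every downstream LOCATION value valid.
        raise ValueError("replacement must have the same length as the span")
    return seq[:i] + replacement + seq[j:]


# Row of SEQ ID NO: 5 covering positions 841..900, blocks joined.
c1_row = "CACCAGCACAGCAGTGTCGTGTCGTCCATGCATGCACTTTAGGTGCAGTGCAGGGCCTCA"

# Positions 884..888 hold TGCAG in C1; the C1-S allele carries TTAGG there.
c1s_row = substitute(c1_row, 884, 888, "TTAGG", offset=841)

print(c1_row[40:50], "->", c1s_row[40:50])
```

Because the replacement has the same length as the original span, the coordinates of all other features in the table are unaffected.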


(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3915 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: EcoRI-HindIII region of plasmid pCOL13

(ix) FEATURE:

(A) NAME/KEY: prim_transcript

(B) LOCATION: 188

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 188..212

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 213..556

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 557..718

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 719..1224

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1225..2770

(D) OTHER INFORMATION: /codon_start= 2

/note= "exon containing 3' end coding region of B-peru gene"

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 576..718

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1225..2770

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1268..2770

(D) OTHER INFORMATION: /note= "3' end of B-peru coding
region which is derived from cDNA"

(ix) FEATURE:

(A) NAME/KEY: 3'UTR

(B) LOCATION: 2771..3272

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 3273..3891

(D) OTHER INFORMATION: /label= 3' region

/note= "further 3' flanking region of B-peru gene. This region
is only of approximate length and the sequence needs to be
confirmed."

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..6

(D) OTHER INFORMATION: /label= EcoRI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 11..16

(D) OTHER INFORMATION: /label= XbaI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 45..50

(D) OTHER INFORMATION: /label= KpnI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 265..270

(D) OTHER INFORMATION: /label= HindIII

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 329..334

(D) OTHER INFORMATION: /label= XbaI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 835..840

(D) OTHER INFORMATION: /label= BamHI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1268..1273

(D) OTHER INFORMATION: /label= MluI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 2787..2792

(D) OTHER INFORMATION: /label= HindIII

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 2883..2888

(D) OTHER INFORMATION: /label= MunI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 2827..2832

(D) OTHER INFORMATION: /label= HindIII

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 3892..3897

(D) OTHER INFORMATION: /label= SalI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 3910..3915

(D) OTHER INFORMATION: /label= HindIII

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 3892..3915

(D) OTHER INFORMATION: /label= polylinker
/note= "part of polylinker of pUC19"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 



GAATTCAGGT TCTAGACTAT TCTTGTGGCC TCGGGCGGAT GGCGGGTACC CATGTCTTCG 60

TTAGGCTTAT CTGACCGTGG AGATGAAATC TAACGGCTCA TAGAAATTAA ACTAACGTGG 120

ACACTCTGTC CTTGCTGTTT TGCTCCCTGC TCTTTATATA TAGAATGCCT GCTTGCATTG 180

CACCCGTACG TACAGCGTAG CGCGGAGTGG AGGTGAGCTC CTCCTCCGAT TCTTGCCTAA 240

TCTTTGGTCT TTGCACACGT ACGAAAGCTT TTTGCATTGT TTCGTTGCTT CTGGATGATC 300

AGTACTCTTA GATATTAAGC GATACCGATC TAGAATCGAG TTGTTGTACT CTCTCTGTCC 360

CTTTTGTGCA GCTATAACTA GCTAGGTTCC TTCGCATAGA GCCTCTCTAC AGAGTACAGA 420

CTAGCTAGCA GTGTCAGACA CGAAATGGAA ATGGTCACTT CCAAATTGCA CGAGCTGGAA 480

TTATATACTC TTCTGATCTT CTTCACCGTC TCTTTATAGC GTGATATGCG TTTCTGGCTT 540

CTTGCTTACG TGAAGGATTA TTAGTAAGGC GCGTGATGGC GCTCTCAGCT TCCCCGGCTC 600

AGGAAGAACT GCTGCAGCCT GCTGGGAGGC CGTTGAGGAA GCAGCTTGCT GCAGCCGCGA 660

GGAGCATCAA CTGGAGCTAT GCCCTCTTCT GGTCCATTTC AAGCACTCAA CGACCTCGGT 720

AAATGGAAGT CCTGATAATC TATAATTTGT CTGGCAGTTT TCTACAACTC TGGTGAATGA 780

TCGTCACTTC GTTTGCCTGA TACATACATA CATACATATG AAATAAAGAA AGTCGGATCC 840

CGTGATGCGA TTGTAGTTAT CGCTTTTCCG CAAAATGGTT GCTTTTTGAA TCTGCATTCG 900

TTTTTTTCCC ACATCTTCTT CCTTCTCGCG AGTAACGACA ACGCCACCCG CGCCGCCTGC 960

CGCCCATCGC CCCGCCTTGG CCGGCGAGAG CCTCAGCCTA TTACACCAGC GGCGACCTCT 1020 

TTTCCCCTTC CTCTCACCGC CCTCGTGGCC GTGCTCTCCC CCGCTCTAAC CTGGTCTGGC 1080 

CGCCTCCGCT GCCACCTGCT CCGGCGGCCT CACCCGCGTC TTTCTCGTCC CTACCCTCTC 1140 

TGCCTCTGGG CGCATCATCA TCTGATATTC TGATGCAAAT AAAAAAGGTA TACCATATAA 1200 

GGACAACAGA AAATATGGTT GCAGGGTGCT GACGTGGACG GACGGGTTCT ACAATGGCGA 1260 

GGTGAAGACG CGTAAGATCT CCCACTCCGT GGAGCTGACA GCCGACCAGC TGCTCATGCA 1320 

GAGGAGCGAG CAGCTCCGGG AGCTCTACGA GGCCCTCCGG TCCGGCGAGT GCGACCGCCG 1380 

CGGCGCGCGG CCGGTGGGCT CGCTGTCGCC GGAGGACCTC GGGGACACCG AGTGGTACTA 1440

CGTGATCTGC ATGACCTACG CCTTCCTGCC GGGCCAAGGC TTGCCCGGCA GGAGTTCCGC 1500 

GAGCAACGAG CATGTCTGGC TGTGCAACGC GCACCTCGCC GGCAGCAAGG ACTTCCCCCG 1560 

CGCGCTCCTG GCCAAGAGCG CGTCCATTCA GACAATCGTC TGCATCCCGC TCATGGGTGG 1620 

CGTGCTTGAG CTTGGTACTA CTGATAAGGT GCCGGAGGAC CCGGACTTGG TCAGCCGAGC 1680 

AACCGTAGCA TTCTGGGAGC CGCAATGTCC GACATACTCG AAAGAGCCGA GCTCCAACCC 1740 

GTCAGCATAC GAAACCGGGG AAGCCGCATA CATAGTCGTG TTGGAGGACC TCGATCACAA 1800

TGCCATGGAC ATGGAGACGG TGACTGCCGC CGCCGGGAGA CACGGAACCG GACAGGAGCT 1860 

AGGAGAAGTC GAGAGCCCGT CAAATGCAAG CCTGGAGCAC ATCACCAAGG GGATCGACGA 1920 

GTTCTACAGC CTCTGCGAGG AAATGGACGT GCAGCCGCTA GAGGATGCCT GGATAATGGA 1980 

CGGGTCTAAT TTCGAAGTCC CGTCGTCAGC GCTCCCGGTG GATGGCTCAA GCGCACCCGC 2040 

TGATGGTTCT CGCGCGACAA GTTTCGTGGT TTGGACGAGG TCATCGCACT CCTGCTCGGG 2100 

TGAAGCGGCG GTGCCGGTCA TCGAAGAGCC GCAGAAATTG CTGAAGAAAG CGTTGGCCGG 2160 

CGGCGGTGCT TGGGCGAACA CGAACTGCGG TGGCGGGGGC ACGACGGTAA CAGCCCAGGA 2220 

AAACGGCGCC AAGAACCACG TCATGTCAGA GCGAAAGCGC CGGGAGAAGC TCAACGAGAT 2280 

GTTCCTCGTT CTCAAGTCGT TGGTTCCCTC CATTCACAAG GTGGACAAAG CATCCATCCT 2340 

CGCCGAAACG ATAGCCTATC TAAAGGAGCT TCAACGAAGG GTACAAGAAC TGGAATCCAG 2400 

GAGGCAAGGT GGCAGTGGGT GTGTCAGCAA GAAAGTCTGT GTGGGCTCCA ACTCCAAGAG 2460 

GAAGAGCCCA GAGTTCGCCG GTGGCGCGAA GGAGCACCCC TGGGTCCTCC CCATGGACGG 2520 

CACCAGCAAC GTCACCGTCA CCGTCTCGGA CACGAACGTG CTCCTGGAGG TGCAATGCCG 2580 

GTGGGAGAAG CTCCTGATGA CACGGGTGTT CGACGCCATC AAGAGCCTCC ATTTGGACGC 2640 
TCTCTCGGTT CAGGCTTCGG CACCAGATGG CTTCATGAGG CTCAAGATAG GAGCTCAGTT 2700

TGCAGGCTCC GGCGCCGTCG TGCCCGGAAT GATCAGCGAA TCTCTTCGTA AAGCTATAGG 2760

GAAGCGATGA AAGGGCGCTA CATGTGAAGC TTAATTAATG GAAGCAAACT TGTATTTCTT 2820

GTGCAAAAGC TTACTATATA TTTCTGCAAA ACCTGGTGTG CCTTGTTTTG ATTTTCAGTC 2880

GCCAATTGTG CCTTTGTTTT TATCAAGTGA TGATCTACAC ATATATATAG GAATATTTGA 2940

AAAGAGCGAT GTCATAGGGT TTTTTTATTA CAAGGAACAA GTCTTTCACG TGCTGGCCTC 3000

ACAAATCCTA AGAGAAAATC TGCTCATTTT GATTGCGTTC CGCAACAACT CTGTAATCCA 3060

TATCCTATGT ATCCGATCAA CTAGTCGATA GCCTCCGTCC GCCACATCAT CATATATCTA 3120

TCTATGTGTG TCATCTGACA CATACTCCTC GCGTACTGTG CTGACATATG ATACTGACAC 3180

AGCATATATG CATGCACATC GTCACACGAC ATATATCTCG CTACTACACA GATATTGGAT 3240

ACGATACTAT ATAGCATCAT GCGTGCTGCG ATNNNNNNNN NNNNNNNNNN NNNNNNNNNN 3300

NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 3360

NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 3420

NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 3480

NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 3540

NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 3600

NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 3660

NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 3720

NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 3780

NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 3840

NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NGTCGACCTG 3900

CAGGCATGCA AGCTT 3915



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4137 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: EcoRI-HindIII region of plasmid pCOL13

(ix) FEATURE:

(A) NAME/KEY: prim_transcript

(B) LOCATION: 188

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 188..212

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 213..556

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 557..718

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 719..1224

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1226..2771

(D) OTHER INFORMATION: /codon_start= 2

/note= "exon containing 3' end coding region of B-peru gene,
this exon continues up to the polyadenylation site."

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 576..718

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1226..2771

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1269..2771

(D) OTHER INFORMATION: /note= "fragment of B-peru coding
region which is derived from cDNA"

(ix) FEATURE:

(A) NAME/KEY: 3'UTR

(B) LOCATION: 2772..4137

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..6

(D) OTHER INFORMATION: /label= EcoRI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 11..16

(D) OTHER INFORMATION: /label= XbaI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 45..50

(D) OTHER INFORMATION: /label= KpnI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 265..270

(D) OTHER INFORMATION: /label= HindIII

(ix) FEATURE:

(A) NAME/KEY: -
(B) LOCATION: 329..334
(D) OTHER INFORMATION: /label= XbaI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 835..840

(D) OTHER INFORMATION: /label= BamHI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1269..1274

(D) OTHER INFORMATION: /label= MluI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 2788..2793

(D) OTHER INFORMATION: /label= HindIII

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 2884..2889

(D) OTHER INFORMATION: /label= MunI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 2828..2833

(D) OTHER INFORMATION: /label= HindIII

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 4114..4119

(D) OTHER INFORMATION: /label= SalI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 4132..4137

(D) OTHER INFORMATION: /label= HindIII

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 4114..4137

(D) OTHER INFORMATION: /label= polylinker
/note= "part of polylinker of pUC19"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 



GAATTCAGGT TCTAGACTAT TCTTGTGGCC TCGGGCGGAT GGCGGGTACC CATGTCTTCG 60

TTAGGCTTAT CTGACCGTGG AGATGAAATC TAACGGCTCA TAGAAATTAA ACTAACGTGG 120

ACACTCTGTC CTTGCTGTTT TGCTCCCTGC TCTTTATATA TAGAATGCCT GCTTGCATTG 180

CACCCGTACG TACAGCGTAG CGCGGAGTGG AGGTGAGCTC CTCCTCCGAT TCTTGCCTAA 240

TCTTTGGTCT TTGCACACGT ACGAAAGCTT TTTGCATTGT TTCGTTGCTT CTGGATGATC 300

AGTACTCTTA GATATTAAGC GATACCGATC TAGAATCGAG TTGTTGTACT CTCTCTGTCC 360

CTTTTGTGCA GCTATAACTA GCTAGGTTCC TTCGCATAGA GCCTCTCTAC AGAGTACAGA 420

CTAGCTAGCA GTGTCAGACA CGAAATGGAA ATGGTCACTT CCAAATTGCA CGAGCTGGAA 480

TTATATACTC TTCTGATCTT CTTCACCGTC TCTTTATAGC GTGATATGCG TTTCTGGCTT 540

CTTGCTTACG TGAAGGATTA TTAGTAAGGC GCGTGATGGC GCTCTCAGCT TCCCCGGCTC 600

AGGAAGAACT GCTGCAGCCT GCTGGGAGGC CGTTGAGGAA GCAGCTTGCT GCAGCCGCGA 660

GGAGCATCAA CTGGAGCTAT GCCCTCTTCT GGTCCATTTC AAGCACTCAA CGACCTCGGT 720

AAATGGAAGT CCTGATAATC TATAATTTGT CTGGCAGTTT TCTACAACTC TGGTGAATGA 780

TCGTCACTTC GTTTGCCTGA TACATACATA CATACATATG AAATAAAGAA AGTCGGATCC 840

CGTGATGCGA TTGTAGTTAT CGCTTTTCCG CAAAATGGTT GCTTTTTGAA TCTGCATTCG 900

TTTTTTTCCC ACATCTTCTT CCTTCTCGCG AGTAACGACA ACGCCACCGC GCGCCGCCTG 960

CCGCCCATCG CCCCGCCTTG GCCGGCGAGA GCCTCAGCCT ATTACACCAG CGGCGACCTC 1020

TTTTCCCCTT CCTCTCACCG CCCTCGTGGC CGTGCTCACC CCCGCTCTAA CCTGGTCTGG 1080

CCGCCTCCGC TGCCACCTGC TCCGGCGGCC TCACCCGCGT GTTTCTCGTC CCTACCCTCT 1140

CTGCCTCTGG GCGCATCATC ATCTGATATT CTGATGCAAA GAAAAAAGGT ATACCATATA 1200

AGGACAACAG AAAATATGGT TGCAGGGTGC TGACGTGGAC GGACGGGTTC TACAATGGCG 1260

AGGTGAAGAC GCGTAAGATC TCCCACTCCG TGGAGCTGAC AGCCGACCAG CTGCTCATGC 1320

AGAGGAGCGA GCAGCTCCGG GAGCTCTACG AGGCCCTCCG GTCCGGCGAG TGCGACCGCC 1380

GCGGCGCGCG GCCGGTGGGC TCGCTGTCGC CGGAGGACCT CGGGGACACC GAGTGGTACT 1440

ACGTGATCTG CATGACCTAC GCCTTCCTGC CGGGCCAAGG CTTGCCCGGC AGGAGTTCCG 1500

CGAGCAACGA GCATGTCTGG CTGTGCAACG CGCACCTCGC CGGCAGCAAG GACTTCCCCC 1560

GCGCGCTCCT GGCCAAGAGC GCGTCCATTC AGACAATCGT CTGCATCCCG CTCATGGGTG 1620

GCGTGCTTGA GCTTGGTACT ACTGATAAGG TGCCGGAGGA CCCGGACTTG GTCAGCCGAG 1680

CAACCGTAGC ATTCTGGGAG CCGCAATGTC CGACATACTC GAAAGAGCCG AGCTCCAACC 1740

CGTCAGCATA CGAAACCGGG GAAGCCGCAT ACATAGTCGT GTTGGAGGAC CTCGATCACA 1800

ATGCCATGGA CATGGAGACG GTGACTGCCG CCGCCGGGAG ACACGGAACC GGACAGGAGC 1860

TAGGAGAAGT CGAGAGCCCG TCAAATGCAA GCGTGGAGCA CATCACCAAG GGGATCGACG 1920

AGTTCTACAG CCTCTGCGAG GAAATGGACG TGCAGCCGCT AGAGGATGCC TGGATAATGG 1980

ACGGGTCTAA TTTCGAAGTC CCGTCGTCAG CGCTCCCGGT GGATGGCTCA AGCGCACCCG 2040

CTGATGGTTC TCGCGCGACA AGTTTCGTGG TTTGGACGAG GTCATCGCAC TCCTGCTCGG 2100

GTGAAGCGGC GGTGCCGGTC ATCGAAGAGC CGCAGAAATT GCTGAAGAAA GCGTTGGCCG 2160

GCGGCGGTGC TTGGGCGAAC ACGAACTGCG GTGGCGGGGG CACGACGGTA ACAGCCCAGG 2220

AAAACGGCGC CAAGAACCAC GTCATGTCAG AGCGAAAGCG CCGGGAGAAG CTCAACGAGA 2280

TGTTCCTCGT TCTCAAGTCG TTGGTTCCCT CCATTCACAA GGTGGACAAA GCATCCATCC 2340

TCGCCGAAAC GATAGCCTAT CTAAAGGAGC TTCAACGAAG GGTACAAGAA CTGGAATCCA 2400

GGAGGCAAGG TGGCAGTGGG TGTGTCAGCA AGAAAGTCTG TGTGGGCTCC AACTCCAAGA 2460

GGAAGAGCCC AGAGTTCGCC GGTGGCGCGA AGGAGCACCC CTGGGTCCTC CCCATGGACG 2520

GCACCAGCAA CGTCACCGTC ACCGTCTCGG ACACGAACGT GCTCCTGGAG GTGCAATGCC 2580

GGTGGGAGAA GCTCCTGATG ACACGGGTGT TCGACGCCAT CAAGAGCCTC CATTTGGACG 2640

CTCTCTCGGT TCAGGCTTCG GCACCAGATG GCTTCATGAG GCTCAAGATA GGAGCTCAGT 2700

TTGCAGGCTC CGGCGCCGTC GTGCCCGGAA TGATCAGCCA ATCTCTTCGT AAAGCTATAG 2760

GGAAGCGATG AAAGGGCGCT ACATGTGAAG CTTAATTAAT GGAAGCAAAC TTGTATTTCT 2820

TGTGCAAAAG CTTACTATAT ATTTCTGCAA AACCTGGTGT GCCTTGTTTT GATTTTCAGT 2880

CGCCAATTGT GCCTTTGTTT TTATCAAGTG ATGATCTACA CTATATATAT GGAATATTTG 2940

AAAAGAGCGA TGTCATAGGG TTTTTTTATT ACAAGGAACA AGTCTTTCAC GTGCTGGCCT 3000

CACAAATCCA AGAGAAAATC TGCTCATTTT GATTGGCTTC CGCAACAACT CTGTAATCCA 3060

TATCCTTTGT ATCCGATCAA CTATGATACC TCCTCCCCCA TCTCTTTTTT TTTTATCTGC 3120

ACAATCTTCT ATTCTACTAT AATGAAACAA TAGAGCCACT ACCGAATATT TCCTCAAAAA 3180


TGTACAACAA 


ACTAGGGTGG 


TCCAAACAAA 


TGCCTAGAGG 


AGCTAGATtC 


TCTTAAATTA 


3240 


GACATCGGTT 


TCTTTTATCT 


CTTCCAGAAG 


GGATAAAAGT 


ATGTGTTTAT 


GGT'CTTCAGT 


3300 


AATACATTGT 


TCGTTTCTCA 


TAGTCAATTT 


AGAGGTGTTT 


AAATGTACTT 


GAACTAATAG 


3360 


TTAGTTGGTT 


TAAAAATTAC 


TATTAAAATT 


AGTTAGTTAA 


TAAATAGCTA 


GCTAAATATT 


3420 


AGCTAATTTG 


TCAAAAGTAG 


CTAATAGCTG 


AATTATTAGC 


TATATTGTTT 


TGATGTCTTC 


3480 


AGCTAATTTT 


AGCAGATCAT 


tattagttct' 


AGTGTATCTA 


AACACACCCT 


TAGTCAAACA 


3540 


TGGTAAAAAA 


AAAGTTGATT 


CACTCATTGC 


TCATCGAAGA 


CGCAGATCAT 


GGCATCCCTC 


3600 


ACACGTTCTT 


CAGCCTACAC 


GGCACTTGCA TTGTAATTGC ATCTCATCTC ATCAACCCTT 


3660 


GTTGTGCATT ACTTGCCACA TGCGCCATCA ATTAACATTT TTTTGTCTCG 


TTCCTGAATT 


3720 


TCCTAACAAA TTTCATCAAA TGTACGCAGA GCTAAAGCTA GCTGTCGATG 


TCAGTTGACA 


3780 


GTTGACACCG 


ATGAATTTTA 


GAAAATTTAG 


TGTAAAGTAC 


TATTTATAAT 


GTTCATGACA 


3840 


CCCATATAAA 


ATATGTTGAC 


ACCGGCAAAC 


CTCAAGGCTA 


GCTTCGCCCC 


TGCCATCAAC 


3900 


CTTACATCTA 


CATTCACCAC 


GAGGTGTGCA 


CGGCCTAGGT 


TCGACTCCTA 


TGTCATGCCT 


3960 


TGCTATCTAC 


AGATTCAGCA AGTGTTGTGT 


TCCTTGTTGT 


CACAATCTAC 


CTTTATTATA 


4020 


AAATTGATGT 


CATATCATGC 


CAAACAACAA ATAATTAATA 


TCGTGTGAAA 


TTTGAATTTC 


4080 


TCTAACATGC 


TCAACCAACC 


TTACCCCTTC 


ACGGTCGACC 


TGCAGGCATG 


CAAGCTT 


4137 
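
A listing of this kind (blocks of ten bases with a running position count) can be checked mechanically, and a fragment defined by positions, as in the claims, can be cut out of it. The following is only a minimal sketch: the function names `parse_listing` and `subsequence` are hypothetical, and it assumes that positions are 1-based and that "between positions X and Y" includes both end positions, which the text does not state explicitly.

```python
import re


def parse_listing(text, start=1):
    """Join the base blocks of a listing and check each printed position count.

    `start` is the position of the first base in `text` (assumed 1-based).
    Returns the sequence and a list of (printed, computed) count mismatches.
    """
    bases, pos, mismatches = [], start - 1, []
    for token in text.split():
        if token.isdigit():
            # A printed running count: compare it with the bases read so far.
            if int(token) != pos:
                mismatches.append((int(token), pos))
        elif re.fullmatch(r"[ACGTacgt]+", token):
            bases.append(token.upper())
            pos += len(token)
        else:
            raise ValueError(f"unexpected token in listing: {token!r}")
    return "".join(bases), mismatches


def subsequence(seq, first, last, start=1):
    """Return positions first..last (assumed 1-based, both ends included)."""
    return seq[first - start:last - start + 1]


# The last two rows of the listing above, which begin at position 4021.
last_rows = """
AAATTGATGT CATATCATGC CAAACAACAA ATAATTAATA TCGTGTGAAA TTTGAATTTC 4080
TCTAACATGC TCAACCAACC TTACCCCTTC ACGGTCGACC TGCAGGCATG CAAGCTT 4137
"""
seq, mismatches = parse_listing(last_rows, start=4021)
print(len(seq), mismatches)
print(subsequence(seq, 4131, 4137, start=4021))
```

A non-empty mismatch list points to a dropped or duplicated block, or to a misread position count, in the rows that precede the reported count.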
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SEQ. ID No. 7 



actagtacctgtcgcgcgcccatgcgcgcgtggcgtgcttctcgccctggtaactgttctcggcaaatga 
ctatttccaagtaaacatattcaatgattttgctattcttagcaaagtaatttcacttggacttttgtgc 
caaaaacgcattggaaaaaatctccttggactccagcctaaggttgaaagtgtaaaaactgggaaaaatt 
attgatgtttcgggcagttacttggctatgtaaattccataccttttcaaaatatcctaaacattctttt 
ctgtttctgcaacatacatgtttatcagttctggacctttgpacgctacgaaagttcagtgagtattcagg 
ctttcgcaagtaaaacctagaagtccaacggacattcattttagcgatcccatgtctttaggatgcactt 
gttatcggatgtctcctatgagacagaatgcacttgttatggtaactaaacaaaaaaatataatttaatt 
cgtgtgaaactttttcaaacctaccttccctgttcccggaggtccatatacccagacacctaatcgcttg 
cgcaatttagaagaaatcatgcgattatacgtcaaagggagctgaaatatcaagcaaaagaaaaggtcat 
cccacaaaagcccaaaactattgtagggaaaacacttgttttacctataattgagcgtcgtattggtgtt 
gctgatatttactgctaaaccaagtccaatttaccagaatagtatctagaagaatccttttcacatcctc 
tagcccgccaacatcctaccatttgacattgagaactaaaaaacaaattgttcccagacgaaagctaaag 
tcgctttatacgattagctgcagtaggtgagcacgatctccgaacgctgggcatgacacgaccatgatag 
acgacatggacattttgtcaaacacctgcatggcgtcaccagggaaaacaatccagcaggagagttggga 
gagagatggaaacaattaattatgcaaacacggaggagacacaatttgaagagtgttcgtacacctacgg 
caatcagcgaaacgatgagagagcataccaagctcgggtcgtcagacacgcggaggacggacggtggcac 
cgatggagatggagacagttgcgtgccgttttttgtggagggcttcgttggtgtcgggcgtcggcggagc 
ctgaacgcggtgggaagaagagcggcgtggtgggaagaagagcgacgtcaggttctagactattcttgtg 
gcctcgggcggatggcgggtacccatgtcttcgttaggcttatctgaccgtggagatgaaatctaacggc 
tcatagaaattaaactaacgtggactcccagacgaaagctaaagtcgctttatacgatcagctgcagtag 
gtgagcacgatctccgaacgctgggcatgacacgaccatgatagacgacatggacattttgtcaaacacc 
tgcatggcgtcaccagggaaaacaatccagcaggagagttgggagagagatggaaacaattaattatgca 
aacacggaggagacacaatttgaagagtgttcgtacacctacggcaatcagcgaaacgatgagagagcat 
accaagctcgggtcgtcagcaacgcggaggacggacggtggcaccgatggagatggagacagttgcgtgc 
cgttttttgtggagggcttcgttggtgtcgggcgtcggcggagcctgaacgcggtgggaagaagagcttc 
gtggtgggaagaagagcgacgacaggttctagactattcttgtggcctcgggcggatggcgggtacccat 
gtcttcgttaggcttatctgaccgtggagatgaaatctaacggctcatagaaattaaactaacgtggaca 
ctctgtcGttgctgttttgctccctgctctttatatatagaatgcctgcttgcattgcacccgtacgtac 
agcgtagcgcggagtccragatgagctcctcctccgattcttgcct&atctttggtctttgcjvcaagtacg 
aaagctttttgcattatttcgtcgcttctggatgatcagtactcttagatattaagcgataccgatctag 
aatcgagttgttgcactctctccgtcccttctgtgcagctataactagctaggttccttcgcatagagcc 
tctctacagagtacagactagctagcagtgtcagacacgaaatggaaatggtcacttccaaattgcacga 
gctggaattatatactcttctgatcttcttcaccgtctctttatagcgtgatatgcgtttctggcttctt 
gcttacgtgaaggattattagtaaggcgcgtgatggcgctctcagcttccccggctcaggaagaactgct 
gcagcctgctgggaggccgttgaggaagcagcttgctgcagccgcgaggagcatcaactggagctatgcc 
ctcttctggtccatttcaagcactcaacgacctcggtaaatggaagtcctgataatctataatttgtctg 
gcagttttctacaactctggtgaatgatcgtcacttcgtttgcctgatacatacatacatacatatgaaa 
taaagaaagtcggatcccgtgatgcgattgcagttatcgcttttccgcaaaatggttgctttttgaatct 

gel • 
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CLAIMS 

1. A plant consisting essentially of cells which comprise 
in their genome: 

- a homozygous male-sterility genotype at a first genetic 
locus; and 

- a color-linked restorer genotype, at a second genetic 
locus, which is heterozygous (Rf/-) for a foreign DNA Rf
comprising: 

a) a fertility-restorer gene capable of preventing the 
phenotypic expression of said male-sterility 
genotype, and

b) at least one anthocyanin regulatory gene involved in 
the regulation of anthocyanin biosynthesis in cells 
of seeds of said plant which is capable of producing 
anthocyanin at least in the seeds of said plant, so 
that anthocyanin production in the seeds is visible 
externally.

2. The plant of claim 1 in which said color gene is
capable of producing anthocyanin at least in the aleurone
of the seeds of said plant.

3. The plant of claim 1, in which said first genetic
locus is homozygous for a foreign DNA S (S/S) which
comprises a male-sterility gene which when expressed in
cells of the plant renders the plant male-sterile without
otherwise substantially affecting the growth and
development of the plant.

4. The plant of claim 1, in which said first genetic
locus is homozygous for a foreign DNA S (S/S) which
comprises a male-sterility gene which comprises:
s1) a male-sterility DNA encoding an RNA, protein or
polypeptide which, when produced or overproduced in a
stamen cell of the plant, significantly disturbs the
metabolism, functioning and/or development of said
cell, and,

s2) a sterility promoter capable of directing expression
of the male-sterility DNA selectively in the stamen
cells, preferably the tapetum cells, of the plant;
the male-sterility DNA being in the same
transcriptional unit as, and under the control of,
the sterility promoter,
and in which said fertility restorer gene in said second
genetic locus comprises at least:
a1) a fertility-restorer DNA encoding a restorer RNA,
protein or polypeptide which, when produced or
overproduced in the same cell as said male-sterility
gene S, prevents the phenotypic expression of S,
and,

a2) a restorer promoter capable of directing expression
of the fertility-restorer DNA at least in the same
cells in which said male-sterility gene is
expressed, so that the phenotypic expression of said
male-sterility gene is prevented; the
fertility-restorer DNA being in the same transcriptional
unit as, and under the control of, the restorer promoter.

5. The plant of claim 1 in which said male-sterility DNA
encodes barnase and in which said fertility restorer DNA
encodes barstar.

6. The plant of claim 1 in which the sterility promoter
and/or the restorer promoter is selected from the group
consisting of PTA29, PCA55, PT72, PT42 and PE1.

7. The plant of claim 1 in which the homozygous
male-sterility genotype is endogenous and is homozygous for a
recessive allele m (m/m) and in which the fertility
restorer gene is the dominant allele M of said endogenous
male-sterility genotype.

8. The plant of claim 1, which is a cereal plant which is
selected from the group consisting of corn, wheat, and
rice.

9. The plant according to claim 1 wherein said
anthocyanin regulatory gene is a shortened R, B or C1
gene or a combination of shortened R, B or C1 genes which
is functional for conditioning and regulating anthocyanin
production in the aleurone.

10. The plant according to claim 9 wherein said
anthocyanin regulatory gene is selected from the group
consisting of a shortened C1 or C1-S gene having a
nucleotide sequence corresponding to the sequence between
positions 447 and 2418 of SEQ ID No. 1, a shortened
B-peru gene having a nucleotide sequence corresponding to
the sequence between positions 1 and 3272 of SEQ ID
No. 6, and the EcoRI-SalI fragment having a length of
about 4000 bp of pCOL13.

11. The plant according to claim 10 wherein said
anthocyanin regulatory gene does not contain any
introns.

12. The plant according to claim 9 wherein said
anthocyanin regulatory gene comprises a shortened C1 or
C1-S gene and a shortened B-peru gene.

13. The plant according to claim 9 wherein said
anthocyanin regulatory gene is a chimeric DNA comprising
a coding region of an R or B gene and/or C1 gene operably
linked to an aleurone-specific promoter.

14. The plant according to claim 13 wherein said
aleurone-specific promoter is selected from the group
consisting of: the sequence between positions 1 to 1077
or 447 to 1077 of SEQ ID No. 1, and the sequence between
positions 1 to 575 of SEQ ID No. 6.

15. The plant according to claim 14, wherein said
aleurone-specific promoter is selected from the group
consisting of: the sequence between positions 1 to 1061
or 447 to 1061 of SEQ ID No. 1, and the sequence between
positions 1 to 188 of SEQ ID No. 6.

16. A DNA comprising an anthocyanin regulatory gene which
is a shortened R, B or C1 gene or a combination of
shortened R, B or C1 genes which is functional for
conditioning and regulating anthocyanin production in the
aleurone.

17. A DNA according to claim 16, which comprises a
shortened C1 or C1-S gene and a shortened B-peru gene.

18. A DNA according to claim 16, which comprises at least
one gene selected from the group consisting of a
shortened B-peru gene having a nucleotide sequence
corresponding to the sequence between positions 1 and
3272 of SEQ ID No. 6, a shortened B-peru gene which is
the EcoRI-SalI fragment with a length of about 4000 bp
of pCOL13 and the shortened C1 or C1-S gene having a
nucleotide sequence corresponding to the sequence between
positions 447 and 2418 of SEQ ID No. 1.

19. The DNA of claim 18 in which said shortened B-peru,
C1 or C1-S gene is further characterized by not
containing any intron.

20. A DNA according to claim 16, wherein said shortened
C1, C1-S or B-peru genes are operably linked to an
aleurone-specific promoter selected from the group
consisting of: the sequence between positions 1 to 1077
or 447 to 1077 of SEQ ID No. 1 and the sequence between
positions 1 to 575 of SEQ ID No. 6.

21. A DNA according to claim 19, wherein said
aleurone-specific promoter is selected from the group
consisting of: the sequence between positions 1 to 1061
or 447 to 1061 of SEQ ID No. 1 and the sequence between
positions 1 to 188 of SEQ ID No. 6.

22. A DNA according to claim 16 which further comprises a
fertility-restorer gene capable of preventing the
phenotypic expression of a male-sterility genotype in a
plant.

23. A DNA according to claim 22 wherein said
fertility-restorer gene encodes barstar.

24. A DNA according to claim 23, wherein barstar is under
the control of a promoter selected from the group
consisting of PTA29, PCA55, PT72, PT42 and PE1.

25. An aleurone-specific promoter selected from the group
consisting of: the sequence between positions 1 to 1077
or 447 to 1077 of SEQ ID No. 1 and the sequence between
positions 1 to 575 of SEQ ID No. 6.

26. An aleurone-specific promoter selected from the group
consisting of: the sequence between positions 1 to 1061
or 447 to 1061 of SEQ ID No. 1 and the sequence between
positions 1 to 188 of SEQ ID No. 6.

27. A process to maintain a line of male-sterile plants,
which comprises the following steps: 

i) crossing 




a) a male-sterile parent plant of said line having in 
a first genetic locus, a homozygous male-sterility 
genotype and 

b) a maintainer parent plant of said line consisting 
essentially of cells which comprise, stably 
integrated in their nuclear genome:

- a homozygous male-sterility genotype at a first
genetic locus; and

- a colored-linked restorer genotype at a second
genetic locus, which is heterozygous for a 
foreign DNA comprising: 

i) a fertility-restorer gene capable of 
preventing the phenotypic expression of 
said male-sterility genotype, and 

ii) at least one anthocyanin regulatory gene 
involved in the regulation of anthocyanin 
biosynthesis in cells of seeds of said 
plant which is capable of producing 
anthocyanin at least in the seeds of said 
plant, so that anthocyanin production in 
the seeds is visible externally, 

ii) obtaining the seeds from said parent plants, and

iii) separating on the basis of color, the seeds in which
no anthocyanin is produced and which grow into
male-sterile parent plants.

28. A process according to claim 27, wherein the genome
of said male-sterile parent plant does not contain at
least one anthocyanin regulatory gene necessary for the
regulation of anthocyanin biosynthesis in the seeds of
said plant to produce externally visible anthocyanin in
said seeds.

29. The process of claim 28, wherein the genome of said
male-sterile parent plant contains a first anthocyanin
regulatory gene and the genome of said maintainer parent
plant contains a second anthocyanin regulatory gene
which, when present with said first anthocyanin
regulatory gene in the genome of a plant is capable of
conditioning the production of externally visible
anthocyanin in seeds.

30. A process to maintain a line of maintainer plants,
which comprises the following steps:

i) crossing:

a) a male-sterile parent plant of said line having, in
a first genetic locus, a homozygous male-sterility
genotype, and

b) a maintainer parent plant of said line consisting
essentially of cells which comprise, stably
integrated in their nuclear genome:

- a homozygous male-sterility genotype at a first
genetic locus; and

- a colored-linked restorer genotype at a second
genetic locus, which is heterozygous for a
foreign DNA comprising:

i) a fertility-restorer gene capable of preventing
the phenotypic expression of said
male-sterility genotype, and

ii) at least one anthocyanin regulatory gene
involved in the regulation of anthocyanin
biosynthesis in cells of seeds of said plant
which is capable of producing anthocyanin at
least in the seeds of said plant, so that
anthocyanin production in the seeds is visible
externally,

ii) obtaining the seeds from said male-sterile parent
plant, and

iii) separating on the basis of color, the seeds in which
anthocyanin is produced and which grow into
maintainer parent plants.

31. A process according to claim 30, wherein the genome 
of said male-sterile parent plant does not contain at 
least one anthocyanin regulatory gene necessary for the 
regulation of anthocyanin biosynthesis in the seeds of 
said plant to produce externally visible anthocyanin in 
said seeds. 

32. The process of claim 31, wherein the genome of said 
male-sterile parent plant contains a first anthocyanin 
regulatory gene and the genome of said maintainer parent 
plant contains a second anthocyanin regulatory gene 
which, when present with said first anthocyanin 
regulatory gene in the genome of a plant is capable of 
conditioning the production of externally visible 
anthocyanin in seeds. 

33. A kit for maintaining a line of male-sterile or
maintainer plants, said kit comprising:

a) a male-sterile parent plant of said line having, in a
first genetic locus, a homozygous male-sterility
genotype, and

b) a maintainer parent plant of said line consisting
essentially of cells which comprise, integrated in
their nuclear genome:

- a homozygous male-sterility genotype at a first
genetic locus; and

- a colored-linked restorer genotype at a second
genetic locus, which is heterozygous for a foreign
DNA comprising:

i) a fertility-restorer gene capable of
preventing the phenotypic expression of said
male-sterility genotype, and

ii) at least one anthocyanin regulatory gene
involved in the regulation of anthocyanin
biosynthesis in cells of seeds of said plant
which is capable of producing anthocyanin at
least in the seeds of said plant, so that
anthocyanin production in the seeds is
visible externally.

34. A process according to claim 33, wherein the genome
of said male-sterile parent plant does not contain at
least one anthocyanin regulatory gene necessary for the
regulation of anthocyanin biosynthesis in the seeds of
said plant to produce externally visible anthocyanin in
said seeds.

35. The process of claim 34, wherein the genome of said
male-sterile parent plant contains a first anthocyanin
regulatory gene and the genome of said maintainer parent
plant contains a second anthocyanin regulatory gene
which, when present with said first anthocyanin
regulatory gene in the genome of a plant is capable of
conditioning the production of externally visible
anthocyanin in seeds.

36. Process to maintain a kit according to claim 33 which
comprises:

- crossing said male-sterile parent plant with said
maintainer parent plant;
- obtaining the seeds from said male-sterile parent
plants and optionally the seeds from said maintainer
parent plant in which no anthocyanin is produced; and
- optionally growing said seeds into male-sterile parent
plants and maintainer parent plants.
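
The seed classes that the maintenance processes of claims 27, 30 and 36 sort by color can be illustrated with a short enumeration of the cross at the second genetic locus. This is a sketch under stated assumptions, not part of the claimed subject matter: it assumes simple Mendelian segregation with equal transmission of both gametes, complete linkage of the fertility-restorer gene and the anthocyanin regulatory gene within the foreign DNA Rf, and a male-sterile parent that gives colorless seed in the absence of Rf (as in claim 28). The first locus is homozygous in both parents and is therefore left out. The names `cross` and `phenotype` are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(female, male):
    """Return offspring genotype frequencies at one locus.

    Each parent is a pair of alleles; every gamete is assumed equally likely.
    """
    counts = Counter(tuple(sorted(pair)) for pair in product(female, male))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


def phenotype(genotype):
    """Seed color and fertility implied by the second-locus genotype."""
    has_rf = "Rf" in genotype
    return {
        "seed": "colored" if has_rf else "colorless",
        "plant": "male-fertile maintainer" if has_rf else "male-sterile",
    }


male_sterile = ("-", "-")   # no foreign DNA Rf at the second locus
maintainer = ("Rf", "-")    # heterozygous (Rf/-), used as pollen parent

# Seeds harvested from the male-sterile parent (claims 27 and 30).
for genotype, freq in sorted(cross(male_sterile, maintainer).items()):
    print(genotype, freq, phenotype(genotype))

# Seeds harvested from a selfed maintainer parent (optional step of claim 36).
for genotype, freq in sorted(cross(maintainer, maintainer).items()):
    print(genotype, freq, phenotype(genotype))
```

Under these assumptions, half of the seeds harvested from the male-sterile parent are colorless and grow into male-sterile plants, and the other half are colored and grow into maintainer plants, which is the basis of the color separation step.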
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Figure 2 
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